DataFrameResampler
==================

.. cpp:class:: pandas::DataFrameResampler

   Resampling class that groups the rows of a ``DataFrame`` into time periods of a given frequency and aggregates each period.

Example
-------

.. code-block:: cpp

   #include "pd_resampler.h"  // header named in the Location columns below; adjust the path for your build

   using namespace pandas;

   // Assumes df is an existing DataFrame with a datetime index.
   DataFrameResampler resampler(df, "D");  // "D" is the frequency used in the ohlc test
   DataFrame daily_mean = resampler.mean();

Constructors
------------

.. list-table::
   :widths: 55 25 20
   :header-rows: 1

   * - Signature
     - Location
     - Example
   * - ``DataFrameResampler(const DataFrame& df, const std::string& freq, const std::string& closed = "", const std::string& label = "", const std::string& origin = "epoch", int64_t offset_nanos = 0)``
     - pd_resampler.h:308
     -

Indexing / Selection
--------------------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``DataFrame first() const``
     - DataFrame
     - pd_resampler.h:361
     - :ref:`View <example-dataframeresampler-first-0>`
   * - ``std::vector get_numeric_columns() const``
     - std::vector
     - pd_resampler.h:424
     -
   * - ``int64_t get_period_key(int64_t epoch_ns) const``
     - int64_t
     - pd_resampler.h:422
     -
   * - ``DataFrame last() const``
     - DataFrame
     - pd_resampler.h:364
     - :ref:`View <example-dataframeresampler-last-1>`

Missing Data
------------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``DataFrame bfill() const``
     - DataFrame
     - pd_resampler.h:387
     - :ref:`View <example-dataframeresampler-bfill-2>`
   * - ``DataFrame ffill() const``
     - DataFrame
     - pd_resampler.h:377
     - :ref:`View <example-dataframeresampler-ffill-3>`

Statistics
----------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``DataFrame count() const``
     - DataFrame
     - pd_resampler.h:335
     - :ref:`View <example-dataframeresampler-count-4>`
   * - ``DataFrame max() const``
     - DataFrame
     - pd_resampler.h:332
     - :ref:`View <example-dataframeresampler-max-5>`
   * - ``DataFrame mean() const``
     - DataFrame
     - pd_resampler.h:330
     - :ref:`View <example-dataframeresampler-mean-6>`
   * - ``DataFrame median() const``
     - DataFrame
     - pd_resampler.h:336
     - :ref:`View <example-dataframeresampler-median-7>`
   * - ``DataFrame min() const``
     - DataFrame
     - pd_resampler.h:331
     - :ref:`View <example-dataframeresampler-min-8>`
   * - ``DataFrame std_(int ddof = 1) const``
     - DataFrame
     - pd_resampler.h:333
     - :ref:`View <example-dataframeresampler-std_-9>`
   * - ``DataFrame sum() const``
     - DataFrame
     - pd_resampler.h:329
     - :ref:`View <example-dataframeresampler-sum-10>`
   * - ``DataFrame var(int ddof = 1) const``
     - DataFrame
     - pd_resampler.h:334
     - :ref:`View <example-dataframeresampler-var-11>`

Aggregation
-----------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``DataFrame agg(const std::string& func_name) const``
     - DataFrame
     - pd_resampler.h:342
     - :ref:`View <example-dataframeresampler-agg-12>`
   * - ``DataFrame agg(const std::vector& funcs) const``
     - DataFrame
     - pd_resampler.h:348
     - :ref:`View <example-dataframeresampler-agg-13>`
   * - ``DataFrame agg(const std::vector>>& col_funcs) const``
     - DataFrame
     - pd_resampler.h:354
     - :ref:`View <example-dataframeresampler-agg-14>`
   * - ``std::vector aggregate_column(size_t col_idx, const std::string& func) const``
     - std::vector
     - pd_resampler.h:427
     -
   * - ``DataFrame transform(const std::string& func_name) const``
     - DataFrame
     - pd_resampler.h:384
     - :ref:`View <example-dataframeresampler-transform-15>`

Other Methods
-------------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``void build_groups()``
     - void
     - pd_resampler.h:421
     -
   * - ``const std::string& closed() const``
     - const std::string&
     - pd_resampler.h:406
     - :ref:`View <example-dataframeresampler-closed-16>`
   * - ``static double compute_agg(const std::vector& values, const std::string& func, int ddof = 1)``
     - double
     - pd_resampler.h:430
     - :ref:`View <example-dataframeresampler-compute_agg-17>`
   * - ``const DataFrame& dataframe() const``
     - const DataFrame&
     - pd_resampler.h:403
     - :ref:`View <example-dataframeresampler-dataframe-18>`
   * - ``const std::string& freq() const``
     - const std::string&
     - pd_resampler.h:400
     - :ref:`View <example-dataframeresampler-freq-19>`
   * - ``const std::vector& group_keys() const``
     - const std::vector&
     - pd_resampler.h:415
     -
   * - ``const std::unordered_map>& groups() const``
     - const std::unordered_map>&
     - pd_resampler.h:412
     - :ref:`View <example-dataframeresampler-groups-20>`
   * - ``const std::string& label() const``
     - const std::string&
     - pd_resampler.h:409
     - :ref:`View <example-dataframeresampler-label-21>`
   * - ``size_t ngroups() const``
     - size_t
     - pd_resampler.h:397
     - :ref:`View <example-dataframeresampler-ngroups-22>`
   * - ``DataFrame ohlc() const``
     - DataFrame
     - pd_resampler.h:370
     - :ref:`View <example-dataframeresampler-ohlc-23>`
   * - ``int64_t period_key_to_timestamp(int64_t key) const``
     - int64_t
     - pd_resampler.h:423
     -
   * - ``const std::vector& period_timestamps() const``
     - const std::vector&
     - pd_resampler.h:418
     -
   * - ``DataFrame size() const``
     - DataFrame
     - pd_resampler.h:394
     - :ref:`View <example-dataframeresampler-size-24>`

Code Examples
-------------

The following examples are extracted from the test suite. They are matched by method name, so most exercise a same-named method on another class (for example ``BooleanArray``, ``Series``, or a group-by object); the ``ohlc`` example is the one that calls a resampler directly.

.. _example-dataframeresampler-first-0:

.. dropdown:: first (pd_test_1_all.cpp:11616)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 11606
      :emphasize-lines: 11

      void pd_test_groupby_first_last() {
          std::cout << "========= GroupBy first/last ====================";

          std::map> data = {
              {"category", {1.0, 1.0, 2.0, 2.0}},
              {"value", {10.0, 20.0, 30.0, 40.0}}
          };

          pandas::DataFrame df(data);

          auto first_result = df.groupby("category").first();
          auto last_result = df.groupby("category").last();

          // First for group 1: 10, group 2: 30
          // Last for group 1: 20, group 2: 40
          double first1 = std::stod(first_result["value"].get_value_str(0));
          double first2 = std::stod(first_result["value"].get_value_str(1));

          bool passed = ((std::abs(first1 - 10.0) < 0.001 && std::abs(first2 - 30.0) < 0.001) ||
                         (std::abs(first1 - 30.0) < 0.001 && std::abs(first2 - 10.0) < 0.001));
          if (!passed) {

.. _example-dataframeresampler-last-1:

.. dropdown:: last (pd_test_1_all.cpp:11617)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 11607
      :emphasize-lines: 11

      void pd_test_groupby_first_last() {
          std::cout << "========= GroupBy first/last ====================";

          std::map> data = {
              {"category", {1.0, 1.0, 2.0, 2.0}},
              {"value", {10.0, 20.0, 30.0, 40.0}}
          };
          pandas::DataFrame df(data);

          auto first_result = df.groupby("category").first();
          auto last_result = df.groupby("category").last();

          // First for group 1: 10, group 2: 30
          // Last for group 1: 20, group 2: 40
          double first1 = std::stod(first_result["value"].get_value_str(0));
          double first2 = std::stod(first_result["value"].get_value_str(1));

          bool passed = ((std::abs(first1 - 10.0) < 0.001 && std::abs(first2 - 30.0) < 0.001) ||
                         (std::abs(first1 - 30.0) < 0.001 && std::abs(first2 - 10.0) < 0.001));
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_groupby_first_last() : first values incorrect" << std::endl;

.. _example-dataframeresampler-bfill-2:
.. _example-dataframeresampler-ffill-3:

.. dropdown:: bfill / ffill (pd_test_1_all.cpp:23603)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 23593
      :emphasize-lines: 11

          std::cout << "====================================== [OK] pd_test_equals test suite ========================== " << std::endl;
          return 0;
      }

      } // namespace dataframe_tests

      // ------------------- pd_test_equals.cpp (end) -----------------------------

      // ------------------- pd_test_ffill_bfill.cpp (start) -----------------------------
      // dataframe_tests/pd_test_ffill_bfill.cpp
      // Test file for DataFrame.ffill() and DataFrame.bfill() methods

      #include
      #include
      #include
      #include
      #include
      #include "../pandas/pd_dataframe.h"

      // CRITICAL: No using namespace directives

.. _example-dataframeresampler-count-4:

.. dropdown:: count (pd_test_1_all.cpp:66)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 56
      :emphasize-lines: 11

          if (arr.is_na(0)) {
              std::cout << " [FAIL] : in pd_test_boolean_array_na_handling() : is_na(0) should be false" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_na_handling failed: is_na(0) should be false");
          }

          if (!arr.has_na()) {
              std::cout << " [FAIL] : in pd_test_boolean_array_na_handling() : has_na() should be true" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_na_handling failed: has_na() should be true");
          }

          if (arr.count() != 2) {
              std::cout << " [FAIL] : in pd_test_boolean_array_na_handling() : count() should be 2" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_na_handling failed: count() should be 2");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_boolean_array_kleene_and() {
          std::cout << "========= BooleanArray: Kleene AND ======================= ";

.. _example-dataframeresampler-max-5:

.. dropdown:: max (pd_test_1_all.cpp:771)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 761
      :emphasize-lines: 11

          pandas::CategoricalArray arr = pandas::CategoricalArray::from_codes(codes, cats, true); // ordered

          // Test min
          std::optional min_val = arr.min();
          if (!min_val.has_value() || *min_val != "low") {
              std::cout << " [FAIL] : in pd_test_categorical_array_ordered_operations() : min != 'low'" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_ordered_operations failed: min != 'low'");
          }

          // Test max
          std::optional max_val = arr.max();
          if (!max_val.has_value() || *max_val != "high") {
              std::cout << " [FAIL] : in pd_test_categorical_array_ordered_operations() : max != 'high'" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_ordered_operations failed: max != 'high'");
          }

          // Test unordered throws for min/max
          pandas::CategoricalArray unordered = arr.as_unordered();
          bool threw = false;
          try {
              unordered.min();

.. _example-dataframeresampler-mean-6:

.. dropdown:: mean (pd_test_1_all.cpp:282)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 272
      :emphasize-lines: 11

              std::optional(true),
              std::optional(true)
          });

          auto s = arr.sum();
          if (!s.has_value() || s.value() != 3) {
              std::cout << " [FAIL] : in pd_test_boolean_array_reductions() : sum should be 3" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_reductions failed: sum");
          }

          auto m = arr.mean();
          if (!m.has_value() || std::abs(m.value() - 0.75) > 0.001) {
              std::cout << " [FAIL] : in pd_test_boolean_array_reductions() : mean should be 0.75" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_reductions failed: mean");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_boolean_array_dtype() {
          std::cout << "========= BooleanArray: dtype ======================= ";

.. _example-dataframeresampler-median-7:

.. dropdown:: median (pd_test_1_all.cpp:20910)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 20900
      :emphasize-lines: 11

              throw std::runtime_error("pd_test_expanding_var failed: expanding var values incorrect");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_expanding_median() {
          std::cout << "========= Expanding median ======================";

          pandas::Series s({1.0, 2.0, 3.0, 4.0, 5.0});
          auto result = s.expanding().median();

          // Expanding median: 1, 1.5, 2, 2.5, 3
          bool passed = std::abs(result[0] - 1.0) < 0.001 &&
                        std::abs(result[1] - 1.5) < 0.001 &&
                        std::abs(result[2] - 2.0) < 0.001 &&
                        std::abs(result[3] - 2.5) < 0.001 &&
                        std::abs(result[4] - 3.0) < 0.001;
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_expanding_median() : expanding median values incorrect" << std::endl;
              throw std::runtime_error("pd_test_expanding_median failed: expanding median values incorrect");

.. _example-dataframeresampler-min-8:

.. dropdown:: min (pd_test_1_all.cpp:764)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 754
      :emphasize-lines: 11

      }

      void pd_test_categorical_array_ordered_operations() {
          std::cout << "========= CategoricalArray: ordered operations (min/max) ======================= ";

          std::vector cats = {"low", "medium", "high"};
          std::vector codes = {0, 2, 1, 0, -1}; // low, high, medium, low, NA
          pandas::CategoricalArray arr = pandas::CategoricalArray::from_codes(codes, cats, true); // ordered

          // Test min
          std::optional min_val = arr.min();
          if (!min_val.has_value() || *min_val != "low") {
              std::cout << " [FAIL] : in pd_test_categorical_array_ordered_operations() : min != 'low'" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_ordered_operations failed: min != 'low'");
          }

          // Test max
          std::optional max_val = arr.max();
          if (!max_val.has_value() || *max_val != "high") {
              std::cout << " [FAIL] : in pd_test_categorical_array_ordered_operations() : max != 'high'" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_ordered_operations failed: max != 'high'");

.. _example-dataframeresampler-std_-9:

.. dropdown:: std_ (pd_test_1_all.cpp:20752)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 20742
      :emphasize-lines: 11

              throw std::runtime_error("pd_test_rolling_min_periods failed: with min_periods=1, idx 1 should be 3.0");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_rolling_std() {
          std::cout << "========= Rolling std ===========================";

          pandas::Series s({1.0, 2.0, 3.0, 4.0, 5.0});
          auto result = s.rolling(3).std_();

          // std([1,2,3]) = 1.0 (ddof=1)
          // std([2,3,4]) = 1.0
          // std([3,4,5]) = 1.0
          bool passed = std::abs(result[2] - 1.0) < 0.001;

          if (!passed) {
              std::cout << " [FAIL] : in pd_test_rolling_std() : rolling std should be 1.0" << std::endl;
              throw std::runtime_error("pd_test_rolling_std failed: rolling std should be 1.0");
          }

.. _example-dataframeresampler-sum-10:

.. dropdown:: sum (pd_test_1_all.cpp:276)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 266
      :emphasize-lines: 11

          }

          // Test sum/mean
          pandas::BooleanArray arr({
              std::optional(true),
              std::optional(false),
              std::optional(true),
              std::optional(true)
          });

          auto s = arr.sum();
          if (!s.has_value() || s.value() != 3) {
              std::cout << " [FAIL] : in pd_test_boolean_array_reductions() : sum should be 3" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_reductions failed: sum");
          }

          auto m = arr.mean();
          if (!m.has_value() || std::abs(m.value() - 0.75) > 0.001) {
              std::cout << " [FAIL] : in pd_test_boolean_array_reductions() : mean should be 0.75" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_reductions failed: mean");
          }

.. _example-dataframeresampler-var-11:

.. dropdown:: var (pd_test_1_all.cpp:20890)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 20880
      :emphasize-lines: 11

              throw std::runtime_error("pd_test_expanding_std failed: expanding std values incorrect");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_expanding_var() {
          std::cout << "========= Expanding var =========================";

          pandas::Series s({1.0, 2.0, 3.0, 4.0, 5.0});
          auto result = s.expanding().var();

          // Expanding var (ddof=1): NaN, 0.5, 1.0, 1.6667, 2.5
          bool passed = std::isnan(result[0]) &&
                        std::abs(result[1] - 0.5) < 0.001 &&
                        std::abs(result[2] - 1.0) < 0.001 &&
                        std::abs(result[3] - 1.6667) < 0.001 &&
                        std::abs(result[4] - 2.5) < 0.001;
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_expanding_var() : expanding var values incorrect" << std::endl;
              throw std::runtime_error("pd_test_expanding_var failed: expanding var values incorrect");

.. _example-dataframeresampler-agg-12:
.. _example-dataframeresampler-agg-13:
.. _example-dataframeresampler-agg-14:

.. dropdown:: agg (pd_test_1_all.cpp:11100)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 11090
      :emphasize-lines: 11

      }

      void pd_test_func_apply_series_agg() {
          std::cout << "========= Series agg ==================================";

          pandas::Series s({1.0, 2.0, 3.0, 4.0, 5.0}, "values");

          bool passed = true;

          // Test string-based aggregation
          auto sum_result = s.agg("sum");
          if (!sum_result.has_value() || !approx_equal(sum_result.value(), 15.0)) {
              passed = false;
              std::cout << " [FAIL] : in pd_test_func_apply_series_agg() : sum failed" << std::endl;
              throw std::runtime_error("pd_test_func_apply_series_agg failed: sum failed");
          }

          auto mean_result = s.agg("mean");
          if (!mean_result.has_value() || !approx_equal(mean_result.value(), 3.0)) {
              passed = false;
              std::cout << " [FAIL] : in pd_test_func_apply_series_agg() : mean failed" << std::endl;

.. _example-dataframeresampler-transform-15:

.. dropdown:: transform (pd_test_1_all.cpp:11071)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 11061
      :emphasize-lines: 11

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_func_apply_series_transform() {
          std::cout << "========= Series transform ============================";

          pandas::Series s({1.0, 2.0, 3.0, 4.0}, "values");

          // Transform must return same shape
          auto result = s.transform([](double x) { return x * 2 + 1; });

          bool passed = true;
          if (result.size() != s.size()) {
              passed = false;
              std::cout << " [FAIL] : in pd_test_func_apply_series_transform() : size changed" << std::endl;
              throw std::runtime_error("pd_test_func_apply_series_transform failed: size changed");
          }

          std::vector expected = {3.0, 5.0, 7.0, 9.0};
          for (size_t i = 0; i < result.size(); ++i) {

.. _example-dataframeresampler-closed-16:

.. dropdown:: closed (pd_test_1_all.cpp:1903)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 1893
      :emphasize-lines: 11

      // ============================================================================

      void test_constructors() {
          std::cout << "========= IntervalArray: constructors ======================= ";

          // Default constructor
          pandas::IntervalArrayFloat64 empty;
          if (empty.size() != 0) {
              std::cout << "[FAIL] : in test_constructors() : default constructor size" << std::endl;
              return;
          }
          if (empty.closed() != pandas::IntervalClosed::Right) {
              std::cout << "[FAIL] : in test_constructors() : default closure" << std::endl;
              return;
          }

          // Constructor from left/right arrays
          numpy::NDArray left(std::vector{3});
          numpy::NDArray right(std::vector{3});
          left.setElementAt({0}, 0.0); right.setElementAt({0}, 1.0);
          left.setElementAt({1}, 1.0); right.setElementAt({1}, 2.0);
          left.setElementAt({2}, 2.0); right.setElementAt({2}, 3.0);

.. _example-dataframeresampler-compute_agg-17:

.. dropdown:: compute_agg (pd_test_5_all.cpp:112204)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 112194
      :emphasize-lines: 11

          // Default signature is groupby(by, axis, level, as_index, sort, group_keys, observed, dropna).
          auto gb = df_in.groupby("k", 0, std::nullopt,
                                  /*as_index=*/true, /*sort=*/true,
                                  /*group_keys=*/true, /*observed=*/false,
                                  /*dropna=*/true);
          pandas::DataFrame df = gb.agg("sum");
          std::string actual = df.to_string();

          // Pandas oracle (verified by analysis1 H3 logic + compute_agg empty=0.0):
          //   - "a" observed, sum=10
          //   - "b" observed, sum=20
          //   - "c" unobserved -> compute_agg(empty, "sum") -> 0
          // Plan 12 (Logic-C int widening) has landed: aggregate_column now
          // preserves int64 for integer inputs, so the oracle is int64 with
          // integer literal display (no .0 suffix).
          std::string expected =
              " v\n"
              "k \n"
              "a 10\n"
              "b 20\n"
              "c 0";
          check_case("groupby_agg_dispatch_7c3a91_case_41",

.. _example-dataframeresampler-dataframe-18:

.. dropdown:: dataframe (pd_test_2_all.cpp:11742)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 11732
      :emphasize-lines: 11

              std::cout << " [FAIL] : wrong dimensions" << std::endl;
              std::remove(temp_path.c_str());
              throw std::runtime_error("pd_test_to_hdf_mixed_types failed");
          }

          std::remove(temp_path.c_str());
          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_to_hdf_empty_dataframe() {
          std::cout << "========= to_hdf empty dataframe (real HDF5) ===================";

          pandas::DataFrame df;
          std::string temp_path = "temp/test_hdf5_empty.h5";

          df.to_hdf(temp_path, "df", "w");

          // Just verify file was created
          std::ifstream file(temp_path);
          if (!file.is_open()) {
              std::cout << " [FAIL] : file not created" << std::endl;
              throw std::runtime_error("pd_test_to_hdf_empty_dataframe failed");

.. _example-dataframeresampler-freq-19:

.. dropdown:: freq (pd_test_1_all.cpp:8233)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 8223
      :emphasize-lines: 11

          std::cout << "========= freq property ===============================";

          std::vector> values = {
              numpy::datetime64(0LL, numpy::DateTimeUnit::Nanosecond),
              numpy::datetime64(86400000000000LL, numpy::DateTimeUnit::Nanosecond) // 1 day
          };
          pandas::DatetimeArray arr(values);
          pandas::DatetimeMixinIndex idx(arr);

          // Default freq is nullopt or inferred
          auto f = idx.freq();
          std::string fs = idx.freqstr();

          bool passed = true; // freq may or may not be set

          if (!passed) {
              std::cout << " [FAIL] : in pd_test_datetime_mixin_freq()" << std::endl;
              throw std::runtime_error("pd_test_datetime_mixin_freq failed");
          }

          std::cout << " -> tests passed" << std::endl;
      }

.. _example-dataframeresampler-groups-20:

.. dropdown:: groups (pd_test_2_all.cpp:20864)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 20854
      :emphasize-lines: 11

      // =====================================================================
      // Per-group expanding tests
      // =====================================================================

      void test_series_groupby_expanding_sum() {
          std::cout << " -- test_series_groupby_expanding_sum --" << std::endl;

          // Two groups: A=[1,2,3], B=[10,20]
          std::vector vals = {1.0, 10.0, 2.0, 20.0, 3.0};
          pandas::Series data(vals);
          pandas::Series groups({"A", "B", "A", "B", "A"});

          auto sgb = data.groupby(groups);
          pandas::SeriesGroupByExpandingWindow ew(sgb, 1);
          auto result = ew.sum();

          check(result.size() == 5, "size_5");
          // A group: expanding sum = 1, 3, 6
          // B group: expanding sum = 10, 30
          // Original order: [A:1, B:10, A:3, B:30, A:6]
          check(approx_eq(result[0], 1.0), "A_exp_sum_0");

.. _example-dataframeresampler-label-21:

.. dropdown:: label (pd_test_4_all.cpp:4935)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 4925
      :emphasize-lines: 11

      // Helper: compare and report
      // ----------------------------------------------------------------------------
      static void check_str(const std::string& label, const std::string& expected, const std::string& actual) {
          int _f = 0;
          pandas_tests::check_str_ws(label, expected, actual, _f);
          if (_f > 0) throw std::runtime_error(label + ": str mismatch");
      }

      // Slugify a python compare-test label ("a.b.c" → "a_b_c") matching the
      // scheme in scripts/gen_repr_mismatch_fixtures.py.
      static std::string slugify_label(const std::string& label) {
          std::string out = label;
          for (char& ch : out) {
              if (ch == '.') ch = '_';
          }
          return out;
      }

      // Load a captured pandas-generated expected output file. The file is written

.. _example-dataframeresampler-ngroups-22:

.. dropdown:: ngroups (pd_test_1_all.cpp:11497)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 11487
      :emphasize-lines: 11

          // Create DataFrame with category column
          std::map> data = {
              {"category", {1.0, 1.0, 2.0, 2.0, 2.0}},
              {"value", {10.0, 20.0, 30.0, 40.0, 50.0}}
          };
          pandas::DataFrame df(data);

          // Test groupby
          auto grouped = df.groupby("category");

          bool passed = grouped.ngroups() == 2;

          if (!passed) {
              std::cout << " [FAIL] : in pd_test_groupby_basic() : ngroups should be 2" << std::endl;
              throw std::runtime_error("pd_test_groupby_basic failed: ngroups should be 2");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_groupby_multiple_columns() {
          std::cout << "========= GroupBy multiple columns ==============";

.. _example-dataframeresampler-ohlc-23:

.. dropdown:: ohlc (pd_test_1_all.cpp:20388)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 20378
      :emphasize-lines: 11

              "2020-01-01 11:00:00", "2020-01-01 12:00:00",
              "2020-01-02 09:00:00", "2020-01-02 10:00:00",
              "2020-01-02 11:00:00", "2020-01-02 12:00:00"
          };
          df.set_index(std::make_unique>(dates));

          auto resampler = df.resample("D");
          pandas::DataFrame result = resampler.ohlc();

          // Should have open, high, low, close columns
          const auto& cols = result.columns();
          bool has_open = false, has_high = false, has_low = false, has_close = false;
          for (size_t i = 0; i < cols.size(); ++i) {
              std::string c = cols[i];
              if (c.find("open") != std::string::npos) has_open = true;
              if (c.find("high") != std::string::npos) has_high = true;
              if (c.find("low") != std::string::npos) has_low = true;
              if (c.find("close") != std::string::npos) has_close = true;

.. _example-dataframeresampler-size-24:

.. dropdown:: size (pd_test_1_all.cpp:22)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 12
      :emphasize-lines: 11

      #include "../pandas/pd_boolean_array.h"

      namespace dataframe_tests {
      namespace dataframe_tests_boolean_array {

      void pd_test_boolean_array_constructors() {
          std::cout << "========= BooleanArray: constructors ======================= ";

          // Default constructor
          pandas::BooleanArray arr1;
          if (arr1.size() != 0) {
              std::cout << " [FAIL] : in pd_test_boolean_array_constructors() : default constructor size != 0" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_constructors failed: default constructor size != 0");
          }

          // Initializer list constructor
          pandas::BooleanArray arr2({
              std::optional(true),
              std::optional(false),
              std::nullopt,
              std::optional(true)
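Because most of the extracted examples above do not touch a resampler, the following library-independent sketch may help illustrate the idea behind ``get_period_key`` and the per-period statistics: map each nanosecond timestamp to the start of its bucket, then aggregate the values that share a bucket. Every name in it (``bucket_start``, ``variance``, ``BucketStats``, ``resample_stats``) is hypothetical and chosen for illustration; it assumes fixed-width buckets anchored at the epoch and is not the implementation in ``pd_resampler.h``, which this page does not show.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <map>
#include <vector>

// Hypothetical helper (not the library's get_period_key): returns the start,
// in epoch nanoseconds, of the fixed-width bucket that contains epoch_ns.
// Floor division keeps pre-epoch (negative) timestamps in the correct bucket.
std::int64_t bucket_start(std::int64_t epoch_ns, std::int64_t width_ns) {
    assert(width_ns > 0);
    std::int64_t q = epoch_ns / width_ns;               // truncates toward zero
    if (epoch_ns % width_ns != 0 && epoch_ns < 0) --q;  // adjust to floor
    return q * width_ns;
}

// Hypothetical helper: variance with `ddof` delta degrees of freedom
// (ddof = 1 gives the sample variance). Returns NaN when too few values exist.
double variance(const std::vector<double>& v, int ddof = 1) {
    const auto n = static_cast<std::int64_t>(v.size());
    if (n - ddof <= 0) return std::numeric_limits<double>::quiet_NaN();
    double mean = 0.0;
    for (double x : v) mean += x;
    mean /= static_cast<double>(n);
    double ss = 0.0;
    for (double x : v) ss += (x - mean) * (x - mean);
    return ss / static_cast<double>(n - ddof);
}

// One output row per non-empty bucket.
struct BucketStats {
    std::int64_t start_ns;
    std::size_t count;
    double mean;
    double var;
};

// Groups (timestamp, value) pairs into buckets and aggregates each bucket.
// std::map keeps the buckets ordered by start time.
std::vector<BucketStats> resample_stats(const std::vector<std::int64_t>& ts,
                                        const std::vector<double>& values,
                                        std::int64_t width_ns) {
    std::map<std::int64_t, std::vector<double>> groups;
    for (std::size_t i = 0; i < ts.size() && i < values.size(); ++i) {
        groups[bucket_start(ts[i], width_ns)].push_back(values[i]);
    }
    std::vector<BucketStats> out;
    for (const auto& [start, vals] : groups) {
        double sum = 0.0;
        for (double x : vals) sum += x;
        out.push_back({start, vals.size(),
                       sum / static_cast<double>(vals.size()), variance(vals)});
    }
    return out;
}
```

With a one-day width (86400000000000 ns), the values 10 and 20 stamped on day 0 form a bucket with mean 15 and sample variance 50. A bucket holding a single value reports NaN for the variance at ``ddof = 1``, which mirrors the leading NaN in the expanding ``var`` test above. The sketch requires C++17 for structured bindings.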