DataFrameGroupBy
================

.. cpp:class:: pandas::DataFrameGroupBy

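   Groups the rows of a ``DataFrame`` by one or more key columns and exposes split-apply-combine operations on the resulting groups: aggregation, transformation, filtering, and per-group selection.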

Example
-------

.. code-block:: cpp

   #include <pandas/pandas.h>
   using namespace pandas;

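   // Build a DataFrameGroupBy from a DataFrame and a key column (see Constructors).
   std::map<std::string, std::vector<double>> data = {
       {"category", {1.0, 1.0, 2.0, 2.0}},
       {"value", {10.0, 20.0, 30.0, 40.0}}
   };
   DataFrame df(data);
   DataFrameGroupBy gb(df, "category");
   DataFrame first_rows = gb.first();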
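
The split-apply-combine idea behind the class can be sketched with the standard library alone. This is a minimal illustration, not the library's implementation: ``GroupIndex``, ``build_group_index`` and ``group_means`` are hypothetical names, and the row-index map only mirrors the shape of the values returned by ``groups()`` and ``group_keys_order()``.

.. code-block:: cpp

   #include <cstddef>
   #include <string>
   #include <unordered_map>
   #include <utility>
   #include <vector>

   // Hypothetical stand-in for the grouping state: row indices per key,
   // plus the keys in order of first appearance.
   struct GroupIndex {
       std::unordered_map<std::string, std::vector<std::size_t>> groups;
       std::vector<std::string> order;
   };

   // Split: assign each row index to the group named by its key.
   inline GroupIndex build_group_index(const std::vector<std::string>& keys) {
       GroupIndex idx;
       for (std::size_t row = 0; row < keys.size(); ++row) {
           auto [it, inserted] = idx.groups.try_emplace(keys[row]);
           if (inserted) idx.order.push_back(keys[row]);
           it->second.push_back(row);
       }
       return idx;
   }

   // Apply + combine: compute the mean of `values` within each group and
   // return one (key, mean) pair per group, in first-appearance order.
   inline std::vector<std::pair<std::string, double>> group_means(
           const GroupIndex& idx, const std::vector<double>& values) {
       std::vector<std::pair<std::string, double>> out;
       for (const auto& key : idx.order) {
           const auto& rows = idx.groups.at(key);
           double total = 0.0;
           for (std::size_t row : rows) total += values[row];
           out.emplace_back(key, total / static_cast<double>(rows.size()));
       }
       return out;
   }

The sketch keeps keys in first-appearance order for simplicity. The constructors' ``sort`` flag suggests the real class can order keys differently.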

Constructors
------------

.. list-table::
   :widths: 55 25 20
   :header-rows: 1

   * - Signature
     - Location
     - Example
   * - ``DataFrameGroupBy(const DataFrame& df, const std::vector<std::string>& by, bool as_index = true, bool sort = true, bool dropna = true, bool observed = true, bool group_keys = true)``
     - pd_groupby.h:100
     - 
   * - ``DataFrameGroupBy(const DataFrame& df, const std::string& by, bool as_index = true, bool sort = true, bool dropna = true, bool observed = true, bool group_keys = true)``
     - pd_groupby.h:111
     - 

Indexing / Selection
--------------------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``DataFrame first() const``
     - DataFrame
     - pd_groupby.h:301
     - :ref:`View <example-dataframegroupby-first-0>`
   * - ``std::optional<std::string> first_by_index_name_() const``
     - std::optional<std::string>
     - pd_groupby.h:90
     - 
   * - ``DataFrame get_group(const std::string& key) const``
     - DataFrame
     - pd_groupby.h:323
     - :ref:`View <example-dataframegroupby-get_group-1>`
   * - ``DataFrame get_group(const std::string& key, const std::set<std::string>& exclude_cols) const``
     - DataFrame
     - pd_groupby.h:331
     - :ref:`View <example-dataframegroupby-get_group-2>`
   * - ``std::vector<std::string> get_numeric_value_columns() const``
     - std::vector<std::string>
     - pd_groupby.h:447
     - :ref:`View <example-dataframegroupby-get_numeric_value_columns-3>`
   * - ``std::vector<std::string> get_value_columns(const std::string& agg_name = "") const``
     - std::vector<std::string>
     - pd_groupby.h:453
     - 
   * - ``DataFrame head(int n = 5) const``
     - DataFrame
     - pd_groupby.h:313
     - :ref:`View <example-dataframegroupby-head-4>`
   * - ``DataFrame idxmax(bool numeric_only = false) const``
     - DataFrame
     - pd_groupby.h:465
     - :ref:`View <example-dataframegroupby-idxmax-5>`
   * - ``DataFrame idxmin(bool numeric_only = false) const``
     - DataFrame
     - pd_groupby.h:466
     - :ref:`View <example-dataframegroupby-idxmin-6>`
   * - ``DataFrame idxmin_with_dtype(bool numeric_only = false) const``
     - DataFrame
     - pd_groupby.h:263
     - :ref:`View <example-dataframegroupby-idxmin_with_dtype-7>`
   * - ``DataFrame last() const``
     - DataFrame
     - pd_groupby.h:304
     - :ref:`View <example-dataframegroupby-last-8>`
   * - ``DataFrame tail(int n = 5) const``
     - DataFrame
     - pd_groupby.h:316
     - :ref:`View <example-dataframegroupby-tail-9>`
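
Per-group selection such as ``head`` and ``tail`` can be understood as choosing row indices inside each group and then restoring the original row order. The sketch below assumes pandas-like semantics (the first or last ``n`` rows of every group); ``RowGroups``, ``head_rows`` and ``tail_rows`` are hypothetical helpers, not part of the documented API.

.. code-block:: cpp

   #include <algorithm>
   #include <cstddef>
   #include <vector>

   // One entry per group; each entry lists that group's row indices in
   // ascending order.
   using RowGroups = std::vector<std::vector<std::size_t>>;

   // First `n` row indices of every group, merged back into row order.
   inline std::vector<std::size_t> head_rows(const RowGroups& groups, std::size_t n) {
       std::vector<std::size_t> rows;
       for (const auto& g : groups) {
           const auto take = static_cast<std::ptrdiff_t>(std::min(n, g.size()));
           rows.insert(rows.end(), g.begin(), g.begin() + take);
       }
       std::sort(rows.begin(), rows.end());
       return rows;
   }

   // Last `n` row indices of every group, merged back into row order.
   inline std::vector<std::size_t> tail_rows(const RowGroups& groups, std::size_t n) {
       std::vector<std::size_t> rows;
       for (const auto& g : groups) {
           const auto take = static_cast<std::ptrdiff_t>(std::min(n, g.size()));
           rows.insert(rows.end(), g.end() - take, g.end());
       }
       std::sort(rows.begin(), rows.end());
       return rows;
   }

A group shorter than ``n`` contributes all of its rows. Negative ``n``, which the ``int`` parameter of ``head`` and ``tail`` would permit, is not covered by this sketch.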

Data Manipulation
-----------------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``bool dropna() const``
     - bool
     - pd_groupby.h:407
     - :ref:`View <example-dataframegroupby-dropna-10>`

Statistics
----------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``DataFrame count() const``
     - DataFrame
     - pd_groupby.h:166
     - :ref:`View <example-dataframegroupby-count-11>`
   * - ``DataFrame describe() const``
     - DataFrame
     - pd_groupby.h:171
     - :ref:`View <example-dataframegroupby-describe-12>`
   * - ``DataFrame max(bool numeric_only = false) const``
     - DataFrame
     - pd_groupby.h:163
     - :ref:`View <example-dataframegroupby-max-13>`
   * - ``DataFrame mean(bool numeric_only = false) const``
     - DataFrame
     - pd_groupby.h:161
     - :ref:`View <example-dataframegroupby-mean-14>`
   * - ``DataFrame median(bool numeric_only = false) const``
     - DataFrame
     - pd_groupby.h:167
     - :ref:`View <example-dataframegroupby-median-15>`
   * - ``DataFrame min(bool numeric_only = false) const``
     - DataFrame
     - pd_groupby.h:162
     - :ref:`View <example-dataframegroupby-min-16>`
   * - ``DataFrame nunique(bool dropna = true) const``
     - DataFrame
     - pd_groupby.h:170
     - :ref:`View <example-dataframegroupby-nunique-17>`
   * - ``DataFrame prod(bool numeric_only = false) const``
     - DataFrame
     - pd_groupby.h:168
     - :ref:`View <example-dataframegroupby-prod-18>`
   * - ``DataFrame sem(int ddof = 1, bool numeric_only = false) const``
     - DataFrame
     - pd_groupby.h:169
     - :ref:`View <example-dataframegroupby-sem-19>`
   * - ``DataFrame std_(int ddof = 1, bool numeric_only = false) const``
     - DataFrame
     - pd_groupby.h:164
     - :ref:`View <example-dataframegroupby-std_-20>`
   * - ``DataFrame sum(bool numeric_only = false) const``
     - DataFrame
     - pd_groupby.h:160
     - :ref:`View <example-dataframegroupby-sum-21>`
   * - ``DataFrame var(int ddof = 1, bool numeric_only = false) const``
     - DataFrame
     - pd_groupby.h:165
     - :ref:`View <example-dataframegroupby-var-22>`
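
The ``ddof`` parameter of ``var``, ``std_`` and ``sem`` is the "delta degrees of freedom" subtracted from the sample count in the divisor. The sketch below shows how the three statistics relate for a single group, assuming the usual pandas convention; ``variance``, ``std_dev`` and ``sem`` here are hypothetical free functions written for illustration, not the library's code.

.. code-block:: cpp

   #include <cmath>
   #include <cstddef>
   #include <limits>
   #include <vector>

   // Variance with divisor (n - ddof). Returns NaN when n <= ddof, where
   // the divisor would be zero or negative.
   inline double variance(const std::vector<double>& x, int ddof = 1) {
       const auto n = static_cast<long long>(x.size());
       if (n - ddof <= 0) return std::numeric_limits<double>::quiet_NaN();
       double mean = 0.0;
       for (double v : x) mean += v;
       mean /= static_cast<double>(n);
       double ss = 0.0;  // sum of squared deviations from the mean
       for (double v : x) ss += (v - mean) * (v - mean);
       return ss / static_cast<double>(n - ddof);
   }

   // Standard deviation: square root of the variance.
   inline double std_dev(const std::vector<double>& x, int ddof = 1) {
       return std::sqrt(variance(x, ddof));
   }

   // Standard error of the mean: standard deviation over sqrt(n).
   inline double sem(const std::vector<double>& x, int ddof = 1) {
       return std_dev(x, ddof) / std::sqrt(static_cast<double>(x.size()));
   }

With the default ``ddof = 1`` these are the sample (unbiased) estimates; ``ddof = 0`` gives the population variance.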

Aggregation
-----------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``DataFrame agg(const std::string& func_name) const``
     - DataFrame
     - pd_groupby.h:177
     - :ref:`View <example-dataframegroupby-agg-23>`
   * - ``DataFrame agg(const std::vector<std::string>& funcs) const``
     - DataFrame
     - pd_groupby.h:183
     - :ref:`View <example-dataframegroupby-agg-24>`
   * - ``DataFrame agg(const std::vector<std::pair<std::string, std::vector<std::string>>>& col_funcs) const``
     - DataFrame
     - pd_groupby.h:193
     - :ref:`View <example-dataframegroupby-agg-25>`
   * - ``DataFrame agg(const std::map<std::string, std::string>& col_func_map) const``
     - DataFrame
     - pd_groupby.h:204
     - :ref:`View <example-dataframegroupby-agg-26>`
   * - ``DataFrame agg(std::initializer_list<std::pair<std::string, std::vector<std::string>>> col_funcs_init) const``
     - DataFrame
     - pd_groupby.h:234
     - :ref:`View <example-dataframegroupby-agg-27>`
   * - ``Result agg(const FuncArg& func) const``
     - Result
     - pd_groupby.h:352
     - :ref:`View <example-dataframegroupby-agg-28>`
   * - ``DataFrame agg_callable_with_dtype( const std::function<pandas::ApplyCellResult( const pandas::Series<numpy::float64>&)>& cb) const``
     - DataFrame
     - pd_groupby.h:257
     - :ref:`View <example-dataframegroupby-agg_callable_with_dtype-29>`
   * - ``DataFrame agg_impl( const std::vector<std::pair<std::string, std::vector<std::string>>>& col_funcs, bool list_form) const``
     - DataFrame
     - pd_groupby.h:500
     - 
   * - ``DataFrame agg_named(const std::vector<NamedAggSpec>& specs) const``
     - DataFrame
     - pd_groupby.h:339
     - :ref:`View <example-dataframegroupby-agg_named-30>`
   * - ``DataFrame agg_with_dtype(const std::string& how) const``
     - DataFrame
     - pd_groupby.h:248
     - :ref:`View <example-dataframegroupby-agg_with_dtype-31>`
   * - ``DataFrame agg_with_dtype_list(const std::vector<std::string>& funcs) const``
     - DataFrame
     - pd_groupby.h:252
     - :ref:`View <example-dataframegroupby-agg_with_dtype_list-32>`
   * - ``std::vector<double> aggregate_column(size_t col_idx, const std::string& func) const``
     - std::vector<double>
     - pd_groupby.h:621
     - 
   * - ``DataFrame apply(std::function<DataFrame(const DataFrame&)> fn, bool include_groups = true) const``
     - DataFrame
     - pd_groupby.h:282
     - :ref:`View <example-dataframegroupby-apply-33>`
   * - ``Series<numpy::float64> apply_collect_scalar_results( const std::vector<std::string>& keys, const std::vector<double>& values) const``
     - Series<numpy::float64>
     - pd_groupby.h:526
     - :ref:`View <example-dataframegroupby-apply_collect_scalar_results-34>`
   * - ``Series<std::string> apply_collect_scalar_string_results( const std::vector<std::string>& keys, const std::vector<std::string>& values) const``
     - Series<std::string>
     - pd_groupby.h:536
     - 
   * - ``DataFrame apply_collect_series_results( const std::vector<std::string>& keys, const std::vector<std::string>& col_names, const std::map<std::string, std::vector<double>>& num_cols, const std::map<std::string, std::vector<std::string>>& str_cols, const std::string& columns_axis_name = "") const``
     - DataFrame
     - pd_groupby.h:549
     - :ref:`View <example-dataframegroupby-apply_collect_series_results-35>`
   * - ``DataFrame apply_concat_dataframe_results( const std::vector<std::string>& keys, const std::vector<DataFrame>& dfs, bool use_group_keys) const``
     - DataFrame
     - pd_groupby.h:563
     - :ref:`View <example-dataframegroupby-apply_concat_dataframe_results-36>`
   * - ``void apply_int_dtype_if_needed(DataFrame& result, const std::string& result_col, const std::string& source_col, const std::string& func) const``
     - void
     - pd_groupby.h:636
     - 
   * - ``DataFrameGroupByResampler resample(const std::string& rule, const std::string& closed = "left", const std::string& label = "left") const``
     - DataFrameGroupByResampler
     - pd_groupby.h:512
     - :ref:`View <example-dataframegroupby-resample-37>`
   * - ``DataFrame transform_apply_numeric( std::function<std::vector<double>(const std::string&, const Series<numpy::float64>&)> fn) const``
     - DataFrame
     - pd_groupby.h:473
     - 
   * - ``DataFrame transform_concat_results( const std::map<std::string, std::vector<double>>& col_data, const std::vector<std::string>& value_cols) const``
     - DataFrame
     - pd_groupby.h:584
     - 
   * - ``DataFrame transform_named(const std::string& func_name) const``
     - DataFrame
     - pd_groupby.h:593
     - :ref:`View <example-dataframegroupby-transform_named-38>`

Reshaping
---------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``squeeze_result(DataFrame& result) const``
     - 
     - pd_groupby.h:441
     - :ref:`View <example-dataframegroupby-squeeze_result-39>`

Other Methods
-------------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``bool as_index() const``
     - bool
     - pd_groupby.h:398
     - 
   * - ``void build_groups()``
     - void
     - pd_groupby.h:617
     - 
   * - ``std::vector<std::string> by_column_dtypes() const``
     - std::vector<std::string>
     - pd_groupby.h:388
     - 
   * - ``const std::vector<std::string>& by_columns() const``
     - const std::vector<std::string>&
     - pd_groupby.h:385
     - 
   * - ``DataFrameGroupByColumn<T> column(const std::string& col_name) const``
     - DataFrameGroupByColumn<T>
     - pd_groupby.h:292
     - :ref:`View <example-dataframegroupby-column-40>`
   * - ``static double compute_agg(const std::vector<double>& values, const std::string& func, int ddof = 1)``
     - static double
     - pd_groupby.h:624
     - :ref:`View <example-dataframegroupby-compute_agg-41>`
   * - ``const DataFrame& dataframe() const``
     - const DataFrame&
     - pd_groupby.h:382
     - :ref:`View <example-dataframegroupby-dataframe-42>`
   * - ``DataFrame filter(std::function<bool(const DataFrame&)> predicate) const``
     - DataFrame
     - pd_groupby.h:274
     - :ref:`View <example-dataframegroupby-filter-43>`
   * - ``DataFrame filter_by_group_mask( const std::map<std::string, bool>& group_mask, bool use_dropna = true) const``
     - DataFrame
     - pd_groupby.h:574
     - :ref:`View <example-dataframegroupby-filter_by_group_mask-44>`
   * - ``bool group_keys() const``
     - bool
     - pd_groupby.h:404
     - 
   * - ``const std::vector<std::string>& group_keys_order() const``
     - const std::vector<std::string>&
     - pd_groupby.h:377
     - :ref:`View <example-dataframegroupby-group_keys_order-45>`
   * - ``const std::unordered_map<std::string, std::vector<size_t>>& groups() const``
     - const std::unordered_map<std::string, std::vector<size_t>>&
     - pd_groupby.h:372
     - :ref:`View <example-dataframegroupby-groups-46>`
   * - ``DataFrame idx_extreme_impl_(int which, bool numeric_only) const``
     - DataFrame
     - pd_groupby.h:492
     - 
   * - ``bool list_selected() const``
     - bool
     - pd_groupby.h:413
     - :ref:`View <example-dataframegroupby-list_selected-47>`
   * - ``std::string make_group_key(size_t row_idx) const``
     - std::string
     - pd_groupby.h:618
     - 
   * - ``Series<int64_t> ngroup(bool ascending = true) const``
     - Series<int64_t>
     - pd_groupby.h:359
     - 
   * - ``size_t ngroups() const``
     - size_t
     - pd_groupby.h:369
     - :ref:`View <example-dataframegroupby-ngroups-48>`
   * - ``DataFrame nth(int n) const``
     - DataFrame
     - pd_groupby.h:310
     - :ref:`View <example-dataframegroupby-nth-49>`
   * - ``DataFrame nth(const std::vector<int>& positions, const std::string& dropna_mode = "") const``
     - DataFrame
     - pd_groupby.h:613
     - :ref:`View <example-dataframegroupby-nth-50>`
   * - ``DataFrame nth_by_resolved_slices( const std::vector<std::vector<ResolvedSlice>>& per_group_slices) const``
     - DataFrame
     - pd_groupby.h:488
     - 
   * - ``void rebuild_groups_with_empty_seeds(std::vector<std::string> keys)``
     - void
     - pd_groupby.h:151
     - 
   * - ``DataFrameGroupBy select(const std::vector<std::string>& columns) const``
     - DataFrameGroupBy
     - pd_groupby.h:421
     - :ref:`View <example-dataframegroupby-select-51>`
   * - ``DataFrameGroupBy select_as_list(const std::vector<std::string>& columns) const``
     - DataFrameGroupBy
     - pd_groupby.h:429
     - :ref:`View <example-dataframegroupby-select_as_list-52>`
   * - ``DataFrame select_rows_by_indices( const std::vector<size_t>& row_indices, const std::vector<std::string>& columns = {}, bool exclude_internal = false) const``
     - DataFrame
     - pd_groupby.h:602
     - :ref:`View <example-dataframegroupby-select_rows_by_indices-53>`
   * - ``const std::vector<std::string>& selected_columns() const``
     - const std::vector<std::string>&
     - pd_groupby.h:410
     - :ref:`View <example-dataframegroupby-selected_columns-54>`
   * - ``void set_extra_empty_keys(std::vector<std::string> keys)``
     - void
     - pd_groupby.h:141
     - 
   * - ``void set_owned_df(std::shared_ptr<DataFrame> df)``
     - void
     - pd_groupby.h:123
     - 
   * - ``void set_result_index(DataFrame& result) const``
     - void
     - pd_groupby.h:627
     - 
   * - ``void set_synthetic_freq_key(bool value)``
     - void
     - pd_groupby.h:133
     - 
   * - ``bool should_squeeze_to_series() const``
     - bool
     - pd_groupby.h:416
     - :ref:`View <example-dataframegroupby-should_squeeze_to_series-55>`
   * - ``Series<int64_t> size() const``
     - Series<int64_t>
     - pd_groupby.h:366
     - :ref:`View <example-dataframegroupby-size-56>`
   * - ``bool sort_flag() const``
     - bool
     - pd_groupby.h:401
     - 
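
``size`` and ``ngroup`` both return integer series, but they are indexed differently: in pandas, ``size`` yields one count per group, while ``ngroup`` yields one group number per row. The sketch below illustrates that difference under the assumption that this class follows the pandas behaviour; ``size_sketch`` and ``ngroup_sketch`` are hypothetical names, and plain vectors stand in for ``Series<int64_t>``.

.. code-block:: cpp

   #include <cstddef>
   #include <cstdint>
   #include <string>
   #include <unordered_map>
   #include <vector>

   // Per-group row counts, one entry per key in `order`.
   inline std::vector<std::int64_t> size_sketch(
           const std::vector<std::string>& row_keys,
           const std::vector<std::string>& order) {
       std::unordered_map<std::string, std::int64_t> counts;
       for (const auto& k : row_keys) ++counts[k];
       std::vector<std::int64_t> out;
       out.reserve(order.size());
       for (const auto& k : order) out.push_back(counts[k]);
       return out;
   }

   // Per-row group number: rows of the i-th key in `order` get i, or
   // (ngroups - 1 - i) when `ascending` is false.
   inline std::vector<std::int64_t> ngroup_sketch(
           const std::vector<std::string>& row_keys,
           const std::vector<std::string>& order,
           bool ascending = true) {
       const auto n = static_cast<std::int64_t>(order.size());
       std::unordered_map<std::string, std::int64_t> number;
       for (std::int64_t i = 0; i < n; ++i) {
           number[order[static_cast<std::size_t>(i)]] = ascending ? i : n - 1 - i;
       }
       std::vector<std::int64_t> out;
       out.reserve(row_keys.size());
       for (const auto& k : row_keys) out.push_back(number.at(k));
       return out;
   }

Here ``order`` plays the role of the vector returned by ``group_keys_order()``, and every key in ``row_keys`` is expected to appear in it.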


Code Examples
-------------

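The following excerpts are extracted from the test suite. Each one highlights the line where the method name was matched; some matches are comments or calls to a same-named method on another class, such as ``CategoricalArray::dropna``.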

.. _example-dataframegroupby-first-0:

.. dropdown:: first (pd_test_1_all.cpp:11616)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 11606
      :emphasize-lines: 11

      
              void pd_test_groupby_first_last() {
                  std::cout << "========= GroupBy first/last ====================";
      
                  std::map<std::string, std::vector<double>> data = {
                      {"category", {1.0, 1.0, 2.0, 2.0}},
                      {"value", {10.0, 20.0, 30.0, 40.0}}
                  };
                  pandas::DataFrame df(data);
      
                  auto first_result = df.groupby("category").first();
                  auto last_result = df.groupby("category").last();
      
                  // First for group 1: 10, group 2: 30
                  // Last for group 1: 20, group 2: 40
                  double first1 = std::stod(first_result["value"].get_value_str(0));
                  double first2 = std::stod(first_result["value"].get_value_str(1));
      
                  bool passed = ((std::abs(first1 - 10.0) < 0.001 && std::abs(first2 - 30.0) < 0.001) ||
                                (std::abs(first1 - 30.0) < 0.001 && std::abs(first2 - 10.0) < 0.001));
                  if (!passed) {

.. _example-dataframegroupby-get_group-1:
.. _example-dataframegroupby-get_group-2:

.. dropdown:: get_group (pd_test_2_all.cpp:20487)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 20477
      :emphasize-lines: 11

              ++g_fail;
          }
      }
      
      static bool approx_eq(double a, double b, double tol = 1e-9) {
          if (std::isnan(a) && std::isnan(b)) return true;
          return std::abs(a - b) < tol;
      }
      
      // =====================================================================
      // Test: get_group() with exclude_cols removes groupby columns
      // =====================================================================
      
      void pd_test_groupby_apply_get_group_exclude() {
          std::cout << "  -- pd_test_groupby_apply_get_group_exclude --" << std::endl;
      
          pandas::DataFrame df;
          df.add_column("key", std::vector<std::string>{"a", "a", "b", "b"});
          df.add_column("val1", std::vector<double>{1.0, 2.0, 3.0, 4.0});
          df.add_column("val2", std::vector<double>{10.0, 20.0, 30.0, 40.0});

.. _example-dataframegroupby-get_numeric_value_columns-3:

.. dropdown:: get_numeric_value_columns (pd_test_5_all.cpp:36793)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 36783
      :emphasize-lines: 11

      }
      
      void case_1_groupby_numeric_columns_Int64() {
          const std::string tag = "[X1]";
          try {
              pandas::DataFrame df;
              df.add_column<std::string>("g", {"a","a","b","b"});
              df.add_column_nullable<int64_t>("v_Int64", {1, 2, 3, 4});
              df.add_column<double>("v_Float64", {1.0, 2.0, 3.0, 4.0});
              auto gb = df.groupby(std::vector<std::string>{"g"});
              auto cols = gb.get_numeric_value_columns();
              std::cout << tag << " numeric_cols.size=" << cols.size();
              for (auto& c : cols) std::cout << " [" << c << "]";
              std::cout << "\n";
              bool has_Int64 = std::find(cols.begin(), cols.end(), std::string("v_Int64")) != cols.end();
              std::cout << tag << " has_Int64=" << has_Int64 << "\n";
          } catch (const std::exception& e) {
              std::cout << tag << " exception: " << e.what() << "\n";
          }
      }
      

.. _example-dataframegroupby-head-4:

.. dropdown:: head (pd_test_1_all.cpp:6301)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 6291
      :emphasize-lines: 11

              void pd_test_dataframe_indexing() {
                  std::cout << "========= indexing (loc/iloc) ==============";
      
                  std::map<std::string, std::vector<numpy::float64>> data;
                  data["A"] = {10.0, 20.0, 30.0, 40.0, 50.0};
                  data["B"] = {1.0, 2.0, 3.0, 4.0, 5.0};
      
                  pandas::DataFrame df(data);
      
                  // Test head
                  auto head_df = df.head(3);
                  if (head_df.nrows() != 3) {
                      std::cout << "  [FAIL] : in pd_test_dataframe_indexing() : head(3) nrows != 3" << std::endl;
                      throw std::runtime_error("pd_test_dataframe_indexing failed: head(3) nrows != 3");
                  }
      
                  // Test tail
                  auto tail_df = df.tail(2);
                  if (tail_df.nrows() != 2) {
                      std::cout << "  [FAIL] : in pd_test_dataframe_indexing() : tail(2) nrows != 2" << std::endl;
                      throw std::runtime_error("pd_test_dataframe_indexing failed: tail(2) nrows != 2");

.. _example-dataframegroupby-idxmax-5:
.. _example-dataframegroupby-idxmin-6:

.. dropdown:: idxmax / idxmin (pd_test_1_all.cpp:23956)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 23946
      :emphasize-lines: 11

      
              std::cout << "====================================== [OK] pd_test_ffill_bfill test suite ========================== " << std::endl;
              return 0;
          }
      
      } // namespace dataframe_tests
      // ------------------- pd_test_ffill_bfill.cpp (end) -----------------------------
      
      // ------------------- pd_test_idxmax_idxmin.cpp (start) -----------------------------
      // dataframe_tests/pd_test_idxmax_idxmin.cpp
      // Test for DataFrame.idxmax() and idxmin() methods
      
      #include <iostream>
      #include <stdexcept>
      #include <cmath>
      #include <limits>
      #include "../pandas/pd_dataframe.h"
      
      // CRITICAL: No using namespace directives
      
      namespace dataframe_tests {

.. _example-dataframegroupby-idxmin_with_dtype-7:

.. dropdown:: idxmin_with_dtype (pd_test_5_all.cpp:95397)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 95387
      :emphasize-lines: 11

      
      void case_701_dfgb_idxmin_rangeindex(int& local_fail) {
          std::cout << "-- case_701_dfgb_idxmin_rangeindex\n";
          // Default RangeIndex (int64). Result columns must keep int64 dtype.
          pandas::DataFrame df;
          df.add_column<double>("v", std::vector<double>{3.0, 1.0, 2.0, 0.5});
          df.add_column<int64_t>("key", std::vector<int64_t>{0, 0, 1, 1});
          auto gb = df.groupby("key");
          pandas::DataFrame out;
          std::string err;
          try { out = gb.idxmin_with_dtype(); }
          catch (const std::exception& e) { err = e.what(); }
          catch (...) { err = "<unknown>"; }
          pandas_tests::check(err.empty(),
              "C_26_case_701_dfgb_idxmin_rangeindex()_no_throw", local_fail);
          if (!err.empty()) { std::cout << "  err: " << err << "\n"; return; }
          std::string got = df_col_dtype(out, "v");
          bool ok = (got == "int64");
          pandas_tests::check(ok,
              "C_26_case_701_dfgb_idxmin_rangeindex()_dtype", local_fail);
          if (!ok) std::cout << "  got=[" << got << "] expected=[int64]\n";

.. _example-dataframegroupby-last-8:

.. dropdown:: last (pd_test_1_all.cpp:11617)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 11607
      :emphasize-lines: 11

              void pd_test_groupby_first_last() {
                  std::cout << "========= GroupBy first/last ====================";
      
                  std::map<std::string, std::vector<double>> data = {
                      {"category", {1.0, 1.0, 2.0, 2.0}},
                      {"value", {10.0, 20.0, 30.0, 40.0}}
                  };
                  pandas::DataFrame df(data);
      
                  auto first_result = df.groupby("category").first();
                  auto last_result = df.groupby("category").last();
      
                  // First for group 1: 10, group 2: 30
                  // Last for group 1: 20, group 2: 40
                  double first1 = std::stod(first_result["value"].get_value_str(0));
                  double first2 = std::stod(first_result["value"].get_value_str(1));
      
                  bool passed = ((std::abs(first1 - 10.0) < 0.001 && std::abs(first2 - 30.0) < 0.001) ||
                                (std::abs(first1 - 30.0) < 0.001 && std::abs(first2 - 10.0) < 0.001));
                  if (!passed) {
                      std::cout << "  [FAIL] : in pd_test_groupby_first_last() : first values incorrect" << std::endl;

.. _example-dataframegroupby-tail-9:

.. dropdown:: tail (pd_test_1_all.cpp:6308)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 6298
      :emphasize-lines: 11

                  pandas::DataFrame df(data);
      
                  // Test head
                  auto head_df = df.head(3);
                  if (head_df.nrows() != 3) {
                      std::cout << "  [FAIL] : in pd_test_dataframe_indexing() : head(3) nrows != 3" << std::endl;
                      throw std::runtime_error("pd_test_dataframe_indexing failed: head(3) nrows != 3");
                  }
      
                  // Test tail
                  auto tail_df = df.tail(2);
                  if (tail_df.nrows() != 2) {
                      std::cout << "  [FAIL] : in pd_test_dataframe_indexing() : tail(2) nrows != 2" << std::endl;
                      throw std::runtime_error("pd_test_dataframe_indexing failed: tail(2) nrows != 2");
                  }
      
                  // Test iloc_rows range
                  auto slice = df.iloc_rows(1, 4);
                  if (slice.nrows() != 3) {
                      std::cout << "  [FAIL] : in pd_test_dataframe_indexing() : iloc_rows(1,4) nrows != 3" << std::endl;
                      throw std::runtime_error("pd_test_dataframe_indexing failed: iloc_rows(1,4) nrows != 3");

.. _example-dataframegroupby-dropna-10:

.. dropdown:: dropna (pd_test_1_all.cpp:531)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 521
      :emphasize-lines: 11

              }
      
              // Test isna array
              numpy::NDArray<numpy::bool_> na_mask = arr.isna();
              if (na_mask.getSize() != 4) {
                  std::cout << "  [FAIL] : in pd_test_categorical_array_na_handling() : isna size != 4" << std::endl;
                  throw std::runtime_error("pd_test_categorical_array_na_handling failed: isna size != 4");
              }
      
              // Test dropna
              pandas::CategoricalArray dropped = arr.dropna();
              if (dropped.size() != 2) {
                  std::cout << "  [FAIL] : in pd_test_categorical_array_na_handling() : dropna size != 2" << std::endl;
                  throw std::runtime_error("pd_test_categorical_array_na_handling failed: dropna size != 2");
              }
      
              // Test fillna (fill with existing category)
              pandas::CategoricalArray filled = arr.fillna("a");  // 'a' is in categories
              if (filled.has_na()) {
                  std::cout << "  [FAIL] : in pd_test_categorical_array_na_handling() : fillna should have no NA" << std::endl;
                  throw std::runtime_error("pd_test_categorical_array_na_handling failed: fillna should have no NA");

.. _example-dataframegroupby-count-11:

.. dropdown:: count (pd_test_1_all.cpp:66)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 56
      :emphasize-lines: 11

              if (arr.is_na(0)) {
                  std::cout << "  [FAIL] : in pd_test_boolean_array_na_handling() : is_na(0) should be false" << std::endl;
                  throw std::runtime_error("pd_test_boolean_array_na_handling failed: is_na(0) should be false");
              }
      
              if (!arr.has_na()) {
                  std::cout << "  [FAIL] : in pd_test_boolean_array_na_handling() : has_na() should be true" << std::endl;
                  throw std::runtime_error("pd_test_boolean_array_na_handling failed: has_na() should be true");
              }
      
              if (arr.count() != 2) {
                  std::cout << "  [FAIL] : in pd_test_boolean_array_na_handling() : count() should be 2" << std::endl;
                  throw std::runtime_error("pd_test_boolean_array_na_handling failed: count() should be 2");
              }
      
              std::cout << " -> tests passed" << std::endl;
          }
      
          void pd_test_boolean_array_kleene_and() {
              std::cout << "========= BooleanArray: Kleene AND ======================= ";
      

.. _example-dataframegroupby-describe-12:

.. dropdown:: describe (pd_test_2_all.cpp:19793)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 19783
      :emphasize-lines: 11

              ++g_fail;
          }
      }
      
      static bool approx_eq(double a, double b, double tol = 1e-9) {
          if (std::isnan(a) && std::isnan(b)) return true;
          return std::abs(a - b) < tol;
      }
      
      // =====================================================================
      // Test: describe() default mode — numeric columns only
      // =====================================================================
      
      void pd_test_describe_numeric_only() {
          std::cout << "  -- pd_test_describe_numeric_only --" << std::endl;
      
          pandas::DataFrame df;
          df.add_column("A", std::vector<double>{1.0, 2.0, 3.0, 4.0, 5.0});
          df.add_column("B", std::vector<double>{10.0, 20.0, 30.0, 40.0, 50.0});
          df.add_column("Name", std::vector<std::string>{"a", "b", "c", "d", "e"});
      

.. _example-dataframegroupby-max-13:

.. dropdown:: max (pd_test_1_all.cpp:771)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 761
      :emphasize-lines: 11

              pandas::CategoricalArray arr = pandas::CategoricalArray::from_codes(codes, cats, true);  // ordered
      
              // Test min
              std::optional<std::string> min_val = arr.min();
              if (!min_val.has_value() || *min_val != "low") {
                  std::cout << "  [FAIL] : in pd_test_categorical_array_ordered_operations() : min != 'low'" << std::endl;
                  throw std::runtime_error("pd_test_categorical_array_ordered_operations failed: min != 'low'");
              }
      
              // Test max
              std::optional<std::string> max_val = arr.max();
              if (!max_val.has_value() || *max_val != "high") {
                  std::cout << "  [FAIL] : in pd_test_categorical_array_ordered_operations() : max != 'high'" << std::endl;
                  throw std::runtime_error("pd_test_categorical_array_ordered_operations failed: max != 'high'");
              }
      
              // Test unordered throws for min/max
              pandas::CategoricalArray unordered = arr.as_unordered();
              bool threw = false;
              try {
                  unordered.min();

.. _example-dataframegroupby-mean-14:

.. dropdown:: mean (pd_test_1_all.cpp:282)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 272
      :emphasize-lines: 11

                  std::optional<bool>(true),
                  std::optional<bool>(true)
              });
      
              auto s = arr.sum();
              if (!s.has_value() || s.value() != 3) {
                  std::cout << "  [FAIL] : in pd_test_boolean_array_reductions() : sum should be 3" << std::endl;
                  throw std::runtime_error("pd_test_boolean_array_reductions failed: sum");
              }
      
              auto m = arr.mean();
              if (!m.has_value() || std::abs(m.value() - 0.75) > 0.001) {
                  std::cout << "  [FAIL] : in pd_test_boolean_array_reductions() : mean should be 0.75" << std::endl;
                  throw std::runtime_error("pd_test_boolean_array_reductions failed: mean");
              }
      
              std::cout << " -> tests passed" << std::endl;
          }
      
          void pd_test_boolean_array_dtype() {
              std::cout << "========= BooleanArray: dtype ======================= ";

.. _example-dataframegroupby-median-15:

.. dropdown:: median (pd_test_1_all.cpp:20910)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 20900
      :emphasize-lines: 11

                      throw std::runtime_error("pd_test_expanding_var failed: expanding var values incorrect");
                  }
      
                  std::cout << " -> tests passed" << std::endl;
              }
      
              void pd_test_expanding_median() {
                  std::cout << "========= Expanding median ======================";
      
                  pandas::Series<double> s({1.0, 2.0, 3.0, 4.0, 5.0});
                  auto result = s.expanding().median();
      
                  // Expanding median: 1, 1.5, 2, 2.5, 3
                  bool passed = std::abs(result[0] - 1.0) < 0.001 &&
                                std::abs(result[1] - 1.5) < 0.001 &&
                                std::abs(result[2] - 2.0) < 0.001 &&
                                std::abs(result[3] - 2.5) < 0.001 &&
                                std::abs(result[4] - 3.0) < 0.001;
                  if (!passed) {
                      std::cout << "  [FAIL] : in pd_test_expanding_median() : expanding median values incorrect" << std::endl;
                      throw std::runtime_error("pd_test_expanding_median failed: expanding median values incorrect");

.. _example-dataframegroupby-min-16:

.. dropdown:: min (pd_test_1_all.cpp:764)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 754
      :emphasize-lines: 11

          }
      
          void pd_test_categorical_array_ordered_operations() {
              std::cout << "========= CategoricalArray: ordered operations (min/max) ======================= ";
      
              std::vector<std::string> cats = {"low", "medium", "high"};
              std::vector<numpy::int32> codes = {0, 2, 1, 0, -1};  // low, high, medium, low, NA
              pandas::CategoricalArray arr = pandas::CategoricalArray::from_codes(codes, cats, true);  // ordered
      
              // Test min
              std::optional<std::string> min_val = arr.min();
              if (!min_val.has_value() || *min_val != "low") {
                  std::cout << "  [FAIL] : in pd_test_categorical_array_ordered_operations() : min != 'low'" << std::endl;
                  throw std::runtime_error("pd_test_categorical_array_ordered_operations failed: min != 'low'");
              }
      
              // Test max
              std::optional<std::string> max_val = arr.max();
              if (!max_val.has_value() || *max_val != "high") {
                  std::cout << "  [FAIL] : in pd_test_categorical_array_ordered_operations() : max != 'high'" << std::endl;
                  throw std::runtime_error("pd_test_categorical_array_ordered_operations failed: max != 'high'");

.. _example-dataframegroupby-nunique-17:

.. dropdown:: nunique (pd_test_1_all.cpp:10604)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 10594
      :emphasize-lines: 11

      
          std::cout << " -> tests passed" << std::endl;
      }
      
      void pd_test_extension_index_nunique() {
          std::cout << "========= nunique =========================";
      
          pandas::CategoricalArray arr({"a", "b", "a", "c", "b", std::nullopt});
          pandas::CategoricalIndex idx(arr);
      
          bool passed = (idx.nunique(true) == 3 && idx.nunique(false) == 4);
          if (!passed) {
              std::cout << "  [FAIL] : in pd_test_extension_index_nunique() : nunique check failed" << std::endl;
              throw std::runtime_error("pd_test_extension_index_nunique failed");
          }
      
          std::cout << " -> tests passed" << std::endl;
      }
      
      void pd_test_extension_index_factorize() {
          std::cout << "========= factorize =========================";

.. _example-dataframegroupby-prod-18:

.. dropdown:: prod (pd_test_1_all.cpp:26082)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 26072
      :emphasize-lines: 11

      
              std::cout << "====================================== [OK] pd_test_pivot_table test suite ========================== " << std::endl;
              return 0;
          }
      
      } // namespace dataframe_tests
      // ------------------- pd_test_pivot_table.cpp (end) -----------------------------
      
      // ------------------- pd_test_prod.cpp (start) -----------------------------
      // dataframe_tests/pd_test_prod.cpp
      // Tests for DataFrame.prod() and DataFrame.prod_cols() methods
      
      #include <iostream>
      #include <stdexcept>
      #include <cmath>
      #include <limits>
      #include "../pandas/pd_dataframe.h"
      
      // CRITICAL: No using namespace directives
      
      namespace dataframe_tests {

.. _example-dataframegroupby-sem-19:

.. dropdown:: sem (pd_test_1_all.cpp:4525)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 4515
      :emphasize-lines: 11

      #include "../pandas/pd_dataframe.h"
      #include "../pandas/pd_series.h"
      
      namespace dataframe_tests {
          namespace dataframe_tests_aggregation {
      
              void pd_test_aggregation_series_sem() {
                  std::cout << "========= Series sem ============================";
      
                  pandas::Series<double> s({1.0, 2.0, 3.0, 4.0, 5.0});
                  auto sem_val = s.sem();
                  // std(ddof=1) = sqrt(2.5), sem = sqrt(2.5)/sqrt(5) ≈ 0.707
                  bool passed = sem_val.has_value() && std::abs(*sem_val - 0.707) < 0.01;
                  if (!passed) {
                      std::cout << "  [FAIL] : in pd_test_aggregation_series_sem() : sem value incorrect" << std::endl;
                      throw std::runtime_error("pd_test_aggregation_series_sem failed: sem value incorrect");
                  }
      
                  std::cout << " -> tests passed" << std::endl;
              }
      
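The comment in the excerpt above works the expected value out by hand: the sample standard deviation (``ddof=1``) divided by the square root of the number of observations. A minimal standard-library sketch of that arithmetic is given here. The helper name ``sem_ddof1`` is hypothetical and is not part of the library's API, and returning ``std::nullopt`` for fewer than two values is an assumption of this sketch.

.. code-block:: cpp

   #include <cmath>
   #include <cstddef>
   #include <optional>
   #include <vector>

   // Hypothetical helper (not part of the library): standard error of the
   // mean, computed from the sample standard deviation (ddof = 1).
   std::optional<double> sem_ddof1(const std::vector<double>& values) {
       const std::size_t n = values.size();
       if (n < 2) {
           return std::nullopt;  // assumption: undefined for fewer than 2 values
       }

       double mean = 0.0;
       for (double v : values) {
           mean += v;
       }
       mean /= static_cast<double>(n);

       // Sum of squared deviations from the mean.
       double ss = 0.0;
       for (double v : values) {
           ss += (v - mean) * (v - mean);
       }

       const double sample_var = ss / static_cast<double>(n - 1);
       return std::sqrt(sample_var) / std::sqrt(static_cast<double>(n));
   }

For the input ``{1.0, 2.0, 3.0, 4.0, 5.0}`` the sample variance is 2.5, so the result is ``sqrt(2.5) / sqrt(5)``, roughly 0.707, which matches the tolerance check in the excerpt.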

.. _example-dataframegroupby-std_-20:

.. dropdown:: std_ (pd_test_1_all.cpp:20752)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 20742
      :emphasize-lines: 11

                      throw std::runtime_error("pd_test_rolling_min_periods failed: with min_periods=1, idx 1 should be 3.0");
                  }
      
                  std::cout << " -> tests passed" << std::endl;
              }
      
              void pd_test_rolling_std() {
                  std::cout << "========= Rolling std ===========================";
      
                  pandas::Series<double> s({1.0, 2.0, 3.0, 4.0, 5.0});
                  auto result = s.rolling(3).std_();
      
                  // std([1,2,3]) = 1.0 (ddof=1)
                  // std([2,3,4]) = 1.0
                  // std([3,4,5]) = 1.0
                  bool passed = std::abs(result[2] - 1.0) < 0.001;
                  if (!passed) {
                      std::cout << "  [FAIL] : in pd_test_rolling_std() : rolling std should be 1.0" << std::endl;
                      throw std::runtime_error("pd_test_rolling_std failed: rolling std should be 1.0");
                  }
      
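The expected values in the comments above (a standard deviation of 1.0 for each full window of three consecutive integers) can be reproduced with a short standard-library sketch. The function ``rolling_std_ddof1`` is a hypothetical name, not the library's implementation. Filling positions that have fewer than ``window`` observations with NaN is an assumption of this sketch, since the excerpt only checks index 2.

.. code-block:: cpp

   #include <cmath>
   #include <cstddef>
   #include <limits>
   #include <vector>

   // Hypothetical helper (not part of the library): rolling sample standard
   // deviation (ddof = 1) over a fixed-size window.
   std::vector<double> rolling_std_ddof1(const std::vector<double>& values,
                                         std::size_t window) {
       // Assumption: positions without a full window stay NaN.
       std::vector<double> out(values.size(),
                               std::numeric_limits<double>::quiet_NaN());
       if (window < 2) {
           return out;  // ddof = 1 needs at least two observations
       }

       for (std::size_t end = window; end <= values.size(); ++end) {
           const std::size_t begin = end - window;

           double mean = 0.0;
           for (std::size_t i = begin; i < end; ++i) {
               mean += values[i];
           }
           mean /= static_cast<double>(window);

           double ss = 0.0;
           for (std::size_t i = begin; i < end; ++i) {
               ss += (values[i] - mean) * (values[i] - mean);
           }

           out[end - 1] = std::sqrt(ss / static_cast<double>(window - 1));
       }
       return out;
   }

With ``{1.0, 2.0, 3.0, 4.0, 5.0}`` and a window of 3, indices 2 through 4 each evaluate to 1.0, and indices 0 and 1 remain NaN under the assumption stated above.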

.. _example-dataframegroupby-sum-21:

.. dropdown:: sum (pd_test_1_all.cpp:276)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 266
      :emphasize-lines: 11

              }
      
              // Test sum/mean
              pandas::BooleanArray arr({
                  std::optional<bool>(true),
                  std::optional<bool>(false),
                  std::optional<bool>(true),
                  std::optional<bool>(true)
              });
      
              auto s = arr.sum();
              if (!s.has_value() || s.value() != 3) {
                  std::cout << "  [FAIL] : in pd_test_boolean_array_reductions() : sum should be 3" << std::endl;
                  throw std::runtime_error("pd_test_boolean_array_reductions failed: sum");
              }
      
              auto m = arr.mean();
              if (!m.has_value() || std::abs(m.value() - 0.75) > 0.001) {
                  std::cout << "  [FAIL] : in pd_test_boolean_array_reductions() : mean should be 0.75" << std::endl;
                  throw std::runtime_error("pd_test_boolean_array_reductions failed: mean");
              }

.. _example-dataframegroupby-var-22:

.. dropdown:: var (pd_test_1_all.cpp:20890)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 20880
      :emphasize-lines: 11

                      throw std::runtime_error("pd_test_expanding_std failed: expanding std values incorrect");
                  }
      
                  std::cout << " -> tests passed" << std::endl;
              }
      
              void pd_test_expanding_var() {
                  std::cout << "========= Expanding var =========================";
      
                  pandas::Series<double> s({1.0, 2.0, 3.0, 4.0, 5.0});
                  auto result = s.expanding().var();
      
                  // Expanding var (ddof=1): NaN, 0.5, 1.0, 1.6667, 2.5
                  bool passed = std::isnan(result[0]) &&
                                std::abs(result[1] - 0.5) < 0.001 &&
                                std::abs(result[2] - 1.0) < 0.001 &&
                                std::abs(result[3] - 1.6667) < 0.001 &&
                                std::abs(result[4] - 2.5) < 0.001;
                  if (!passed) {
                      std::cout << "  [FAIL] : in pd_test_expanding_var() : expanding var values incorrect" << std::endl;
                      throw std::runtime_error("pd_test_expanding_var failed: expanding var values incorrect");

.. _example-dataframegroupby-agg-23:
.. _example-dataframegroupby-agg-24:
.. _example-dataframegroupby-agg-25:
.. _example-dataframegroupby-agg-26:
.. _example-dataframegroupby-agg-27:
.. _example-dataframegroupby-agg-28:

.. dropdown:: agg (pd_test_1_all.cpp:11100)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 11090
      :emphasize-lines: 11

              }
      
              void pd_test_func_apply_series_agg() {
                  std::cout << "========= Series agg ==================================";
      
                  pandas::Series<double> s({1.0, 2.0, 3.0, 4.0, 5.0}, "values");
      
                  bool passed = true;
      
                  // Test string-based aggregation
                  auto sum_result = s.agg("sum");
                  if (!sum_result.has_value() || !approx_equal(sum_result.value(), 15.0)) {
                      passed = false;
                      std::cout << "  [FAIL] : in pd_test_func_apply_series_agg() : sum failed" << std::endl;
                      throw std::runtime_error("pd_test_func_apply_series_agg failed: sum failed");
                  }
      
                  auto mean_result = s.agg("mean");
                  if (!mean_result.has_value() || !approx_equal(mean_result.value(), 3.0)) {
                      passed = false;
                      std::cout << "  [FAIL] : in pd_test_func_apply_series_agg() : mean failed" << std::endl;

.. _example-dataframegroupby-agg_callable_with_dtype-29:

.. dropdown:: agg_callable_with_dtype (pd_test_5_all.cpp:95045)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 95035
      :emphasize-lines: 11

          run_sgb_case("count", "object:bool", "int64",
              "C_26_case_412_sgb_count_objbool()", lf); }
      
      void case_501_callable_int_returns_int64(int& local_fail) {
          std::cout << "-- case_501_callable_int_returns_int64\n";
          pandas::DataFrame df = make_mixed_df();
          auto gb = df.groupby("key");
          pandas::DataFrame out;
          std::string err;
          try {
              out = gb.agg_callable_with_dtype(make_int_callable(42));
          } catch (const std::exception& e) {
              err = e.what();
          } catch (...) {
              err = "<unknown>";
          }
          pandas_tests::check(err.empty(),
              "C_26_case_501_callable_int_returns_int64()_no_throw",
              local_fail);
          if (!err.empty()) {
              std::cout << "  err: " << err << "\n";

.. _example-dataframegroupby-agg_named-30:

.. dropdown:: agg_named (pd_test_2_all.cpp:20534)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 20524
      :emphasize-lines: 11

          check(approx_eq(sub_b["val1"].get_value_double(0), 3.0), "get_group_b_val1_r0");
          check(approx_eq(sub_b["val1"].get_value_double(1), 4.0), "get_group_b_val1_r1");
      
          // Empty exclude_cols: same as no-exclude overload
          std::set<std::string> empty_exclude;
          auto sub_empty = gb.get_group("a", empty_exclude);
          check(sub_empty.ncols() == 3, "get_group_empty_excl_cols_3");
      }
      
      // =====================================================================
      // Test: agg_named() basic execution
      // =====================================================================
      
      void pd_test_groupby_apply_named_agg_basic() {
          std::cout << "  -- pd_test_groupby_apply_named_agg_basic --" << std::endl;
      
          pandas::DataFrame df;
          df.add_column("key", std::vector<std::string>{"a", "a", "b", "b"});
          df.add_column("val", std::vector<double>{1.0, 3.0, 5.0, 7.0});
      
          auto gb = df.groupby("key");

.. _example-dataframegroupby-agg_with_dtype-31:

.. dropdown:: agg_with_dtype (pd_test_5_all.cpp:94652)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 94642
      :emphasize-lines: 11

      static void run_dfgb_case(const std::string& fn,
                                const std::string& col,
                                const std::string& expected_dtype,
                                const std::string& label,
                                int& local_fail) {
          pandas::DataFrame df = make_mixed_df();
          auto gb = df.groupby("key");
          pandas::DataFrame out;
          std::string err;
          try {
              out = gb.agg_with_dtype(fn);
          } catch (const std::exception& e) {
              err = e.what();
          } catch (...) {
              err = "<unknown>";
          }
          pandas_tests::check(err.empty(),
              label + "_no_throw",
              local_fail);
          if (!err.empty()) {
              std::cout << "  err: " << err << "\n";

.. _example-dataframegroupby-agg_with_dtype_list-32:

.. dropdown:: agg_with_dtype_list (pd_test_5_all.cpp:94682)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 94672
      :emphasize-lines: 11

      static void run_dfgb_list_case(const std::vector<std::string>& fns,
                                     const std::string& src_col,
                                     const std::vector<std::string>& expected,
                                     const std::string& label,
                                     int& local_fail) {
          pandas::DataFrame df = make_mixed_df();
          auto gb = df.groupby("key");
          pandas::DataFrame out;
          std::string err;
          try {
              out = gb.agg_with_dtype_list(fns);
          } catch (const std::exception& e) {
              err = e.what();
          } catch (...) {
              err = "<unknown>";
          }
          pandas_tests::check(err.empty(),
              label + "_no_throw",
              local_fail);
          if (!err.empty()) {
              std::cout << "  err: " << err << "\n";

.. _example-dataframegroupby-apply-33:

.. dropdown:: apply (pd_test_1_all.cpp:11244)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 11234
      :emphasize-lines: 11

              void pd_test_func_apply_dataframe_apply_axis0() {
                  std::cout << "========= DataFrame apply axis=0 ======================";
      
                  std::map<std::string, std::vector<double>> data = {
                      {"A", {1.0, 2.0, 3.0}},
                      {"B", {4.0, 5.0, 6.0}}
                  };
                  pandas::DataFrame df(data);
      
                  // apply axis=0 applies function to each column
                  auto result = df.apply([](const std::vector<double>& col) {
                      return std::accumulate(col.begin(), col.end(), 0.0);
                  }, 0);
      
                  bool passed = true;
      
                  // Plan F·dtype: axis=0 reduce now returns a single "result" column
                  // with the original column names ("A", "B") as the row index.
                  // Sum of A: 1+2+3=6, Sum of B: 4+5+6=15
                  const auto& result_col = result["result"];
                  double sum_a = std::stod(result_col.get_value_str(0));

.. _example-dataframegroupby-apply_collect_scalar_results-34:

.. dropdown:: apply_collect_scalar_results (pd_test_3_all.cpp:27341)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 27331
      :emphasize-lines: 11

          std::vector<double> values;
          for (const auto& key : keys) {
              auto sub = gb.get_group(key);
              double sum = 0;
              for (size_t r = 0; r < sub.nrows(); ++r) {
                  sum += sub["B"].get_value_double(r);
              }
              values.push_back(sum);
          }
      
          auto result = gb.apply_collect_scalar_results(keys, values);
          check(result.size() == keys.size(), "scalar results size matches keys size");
      
          bool found_bar = false, found_foo = false;
          for (size_t i = 0; i < result.size(); ++i) {
              std::string idx = result.index().get_value_str(i);
              if (idx == "bar") { check(result[i] == 6.0, "bar sum = 6"); found_bar = true; }
              if (idx == "foo") { check(result[i] == 9.0, "foo sum = 9"); found_foo = true; }
          }
          check(found_bar, "bar key found");
          check(found_foo, "foo key found");

.. _example-dataframegroupby-apply_collect_series_results-35:

.. dropdown:: apply_collect_series_results (pd_test_3_all.cpp:27376)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 27366
      :emphasize-lines: 11

              auto sub = gb.get_group(key);
              double b_sum = 0, c_sum = 0;
              for (size_t r = 0; r < sub.nrows(); ++r) {
                  b_sum += sub["B"].get_value_double(r);
                  c_sum += sub["C"].get_value_double(r);
              }
              num_cols["B"].push_back(b_sum / sub.nrows());
              num_cols["C"].push_back(c_sum / sub.nrows());
          }
      
          auto result = gb.apply_collect_series_results(keys, col_names, num_cols, str_cols);
          check(result.ncols() == 2, "series results has 2 columns");
          check(result.nrows() == keys.size(), "series results has correct rows");
          check(result.has_column("B"), "has column B");
          check(result.has_column("C"), "has column C");
      }
      
      void pd_test_gb_apply_dataframe_results() {
          std::cout << "  -- pd_test_gb_apply_dataframe_results --" << std::endl;
      
          auto df = make_test_df();

.. _example-dataframegroupby-apply_concat_dataframe_results-36:

.. dropdown:: apply_concat_dataframe_results (pd_test_3_all.cpp:27398)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 27388
      :emphasize-lines: 11

      
          std::vector<std::string> keys = gb.group_keys_order();
          std::vector<pandas::DataFrame> dfs;
          std::set<std::string> exclude;
          exclude.insert("A");
      
          for (const auto& key : keys) {
              dfs.push_back(gb.get_group(key, exclude));
          }
      
          auto result_gk = gb.apply_concat_dataframe_results(keys, dfs, true);
          check(result_gk.nrows() == df.nrows(), "concat with MI has all rows");
          check(result_gk.has_multiindex(), "concat with group_keys=true has MultiIndex");
      
          auto result_no_gk = gb.apply_concat_dataframe_results(keys, dfs, false);
          check(result_no_gk.nrows() == df.nrows(), "concat without MI has all rows");
      }
      
      void pd_test_gb_filter_basic() {
          std::cout << "  -- pd_test_gb_filter_basic --" << std::endl;
      

.. _example-dataframegroupby-resample-37:

.. dropdown:: resample (pd_test_1_all.cpp:20321)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 20311
      :emphasize-lines: 11

                      "2020-01-01 00:00:00",
                      "2020-01-01 12:00:00",
                      "2020-01-02 00:00:00",
                      "2020-01-02 12:00:00",
                      "2020-01-03 00:00:00",
                      "2020-01-03 12:00:00"
                  };
                  df.set_index(std::make_unique<pandas::Index<std::string>>(dates));
      
                  // Resample to daily
                  auto resampler = df.resample("D");
                  pandas::DataFrame result = resampler.sum();
      
                  // Check that we got aggregated results
                  bool passed = (result.nrows() <= df.nrows());
      
                  if (!passed) {
                      std::cout << "  [FAIL] : in pd_test_timeseries_resample_basic() : resample didn't reduce rows" << std::endl;
                      throw std::runtime_error("pd_test_timeseries_resample_basic failed");
                  }
      

.. _example-dataframegroupby-transform_named-38:

.. dropdown:: transform_named (pd_test_3_all.cpp:27465)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 27455
      :emphasize-lines: 11

          auto result_nodrop = gb.filter_by_group_mask(mask, false);
          check(result_nodrop.nrows() == 5, "dropna=false keeps all rows");
      }
      
      void pd_test_gb_transform_same_shape() {
          std::cout << "  -- pd_test_gb_transform_same_shape --" << std::endl;
      
          auto df = make_test_df();
          auto gb = df.groupby("A");
      
          auto result = gb.transform_named("sum");
          check(result.nrows() == df.nrows(), "transform sum same nrows as input");
          check(result["B"].get_value_double(0) == 9.0, "row 0 (foo) B sum = 9");
          check(result["B"].get_value_double(1) == 6.0, "row 1 (bar) B sum = 6");
          check(result["B"].get_value_double(2) == 9.0, "row 2 (foo) B sum = 9");
      
          auto result_mean = gb.transform_named("mean");
          check(result_mean.nrows() == df.nrows(), "transform mean same nrows");
          check(result_mean["B"].get_value_double(0) == 3.0, "row 0 (foo) B mean = 3");
          check(result_mean["B"].get_value_double(1) == 3.0, "row 1 (bar) B mean = 3");
      
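The checks above rely on the defining property of a group-wise transform: the result has one row per input row, and each row carries the aggregate of the group it belongs to. A minimal standard-library sketch of that idea for the ``"sum"`` case is given here. The function ``transform_sum`` is a hypothetical name and does not reflect how the library implements ``transform_named``.

.. code-block:: cpp

   #include <algorithm>
   #include <cstddef>
   #include <map>
   #include <string>
   #include <vector>

   // Hypothetical helper (not part of the library): broadcast each group's
   // sum back to every row of that group, so output length equals input length.
   std::vector<double> transform_sum(const std::vector<std::string>& keys,
                                     const std::vector<double>& values) {
       const std::size_t n = std::min(keys.size(), values.size());

       // First pass: accumulate one sum per group key.
       std::map<std::string, double> group_sum;
       for (std::size_t i = 0; i < n; ++i) {
           group_sum[keys[i]] += values[i];
       }

       // Second pass: look up the group sum for every original row.
       std::vector<double> out;
       out.reserve(n);
       for (std::size_t i = 0; i < n; ++i) {
           out.push_back(group_sum.at(keys[i]));
       }
       return out;
   }

As an illustration, keys ``{"foo", "bar", "foo", "bar", "foo"}`` with values ``{1, 2, 3, 4, 5}`` give ``{9, 6, 9, 6, 9}``. This data is made up to agree with the sums checked in the excerpt; the actual contents of ``make_test_df()`` are not shown there.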

.. _example-dataframegroupby-squeeze_result-39:

.. dropdown:: squeeze_result (pd_test_2_all.cpp:20697)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 20687
      :emphasize-lines: 11

          std::cout << "  -- test_groupby_squeeze_single_col --" << std::endl;
      
          pandas::DataFrame df;
          df.add_column("key", std::vector<std::string>{"A", "A", "B", "B"});
          df.add_column("val", std::vector<numpy::float64>{1.0, 2.0, 3.0, 4.0});
      
          auto gb = df.groupby("key");
          auto gb_sel = gb.select({"val"});  // single-column select (not select_as_list), so the result can squeeze
          pandas::DataFrame result = gb_sel.sum();
      
          auto squeezed = gb_sel.squeeze_result(result);
      
          // Should be a Series<float64>
          check(std::holds_alternative<pandas::Series<numpy::float64>>(squeezed), "is_float64_series");
      
          auto& s = std::get<pandas::Series<numpy::float64>>(squeezed);
          check(s.size() == 2, "size_2");
          check(s.name() == "val", "name_val");
          check(approx_eq(s[0], 3.0), "A_sum_3");
          check(approx_eq(s[1], 7.0), "B_sum_7");
      }

.. _example-dataframegroupby-column-40:

.. dropdown:: column (pd_test_1_all.cpp:22039)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 22029
      :emphasize-lines: 11

                  std::string a1 = result.iat<double>(1, col_a_idx) == -1.0 ? "ok" : "fail";
                  std::string a2 = result.iat<double>(2, col_a_idx) == 3.0 ? "ok" : "fail";
                  std::string a3 = result.iat<double>(3, col_a_idx) == 4.0 ? "ok" : "fail";
      
                  if (a0 != "ok" || a1 != "ok" || a2 != "ok" || a3 != "ok") {
                      passed = false;
                      error_msg = "Column A values incorrect: A[0]=" + a0 + ", A[1]=" + a1 +
                                  ", A[2]=" + a2 + ", A[3]=" + a3;
                  }
      
                  // Check B column (should keep its original values; B[0] is spot-checked)
                  double b0 = result.iat<double>(0, col_b_idx);
                  if (b0 != 5.0) {
                      passed = false;
                      error_msg = "B[0] should be 5, got " + std::to_string(b0);
                  }
      
                  if (!passed) {
                      std::cout << "  [FAIL] : in pd_test_where_basic() : " << error_msg << std::endl;
                      throw std::runtime_error("pd_test_where_basic failed: " + error_msg);
                  }

.. _example-dataframegroupby-compute_agg-41:

.. dropdown:: compute_agg (pd_test_5_all.cpp:112204)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 112194
      :emphasize-lines: 11

          // Default signature is groupby(by, axis, level, as_index, sort, group_keys, observed, dropna).
          auto gb = df_in.groupby("k", 0, std::nullopt, /*as_index=*/true,
                                  /*sort=*/true, /*group_keys=*/true,
                                  /*observed=*/false, /*dropna=*/true);
          pandas::DataFrame df = gb.agg("sum");
          std::string actual = df.to_string();
      
          // Pandas oracle (verified by analysis1 H3 logic + compute_agg empty=0.0):
          // - "a" observed, sum=10
          // - "b" observed, sum=20
          // - "c" unobserved -> compute_agg(empty, "sum") -> 0
          // Plan 12 (Logic-C int widening) has landed: aggregate_column now
          // preserves int64 for integer inputs, so the oracle is int64 with
          // integer literal display (no .0 suffix).
          std::string expected =
              "    v\n"
              "k    \n"
              "a  10\n"
              "b  20\n"
              "c   0";
          check_case("groupby_agg_dispatch_7c3a91_case_41",

.. _example-dataframegroupby-dataframe-42:

.. dropdown:: dataframe (pd_test_2_all.cpp:11742)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 11732
      :emphasize-lines: 11

                      std::cout << "  [FAIL] : wrong dimensions" << std::endl;
                      std::remove(temp_path.c_str());
                      throw std::runtime_error("pd_test_to_hdf_mixed_types failed");
                  }
      
                  std::remove(temp_path.c_str());
                  std::cout << " -> tests passed" << std::endl;
              }
      
              void pd_test_to_hdf_empty_dataframe() {
                  std::cout << "========= to_hdf empty dataframe (real HDF5) ===================";
      
                  pandas::DataFrame df;
                  std::string temp_path = "temp/test_hdf5_empty.h5";
                  df.to_hdf(temp_path, "df", "w");
      
                  // Just verify file was created
                  std::ifstream file(temp_path);
                  if (!file.is_open()) {
                      std::cout << "  [FAIL] : file not created" << std::endl;
                      throw std::runtime_error("pd_test_to_hdf_empty_dataframe failed");

.. _example-dataframegroupby-filter-43:

.. dropdown:: filter (pd_test_3_all.cpp:2805)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 2795
      :emphasize-lines: 11

              threw = true;
          }
          if (!threw) {
              throw std::runtime_error("bool_() should throw for multi-element DataFrame");
          }
      
          std::cout << " -> tests passed" << std::endl;
      }
      
      void pd_test_3_all_df_filter() {
          std::cout << "========= DataFrame.filter() =============================";
      
          std::map<std::string, std::vector<double>> data = {
              {"col_a", {1.0, 2.0, 3.0}},
              {"col_b", {4.0, 5.0, 6.0}},
              {"other", {7.0, 8.0, 9.0}}
          };
          pandas::DataFrame df(data);
      
          // Test filter by items
          pandas::DataFrame filtered_items = df.filter({"col_a", "col_b"});

.. _example-dataframegroupby-filter_by_group_mask-44:

.. dropdown:: filter_by_group_mask (pd_test_3_all.cpp:27422)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 27412
      :emphasize-lines: 11

          std::map<std::string, bool> mask;
          for (const auto& key : gb.group_keys_order()) {
              auto sub = gb.get_group(key);
              double sum = 0;
              for (size_t r = 0; r < sub.nrows(); ++r) {
                  sum += sub["B"].get_value_double(r);
              }
              mask[key] = (sum > 5);
          }
      
          auto result = gb.filter_by_group_mask(mask, true);
          check(result.nrows() == 5, "all rows pass filter (both groups sum > 5)");
      
          std::map<std::string, bool> mask3;
          mask3["bar"] = false;
          mask3["foo"] = true;
          auto result3 = gb.filter_by_group_mask(mask3, true);
          check(result3.nrows() == 3, "only foo rows kept (3 rows)");
      }
      
      void pd_test_gb_filter_preserves_order() {

.. _example-dataframegroupby-group_keys_order-45:

.. dropdown:: group_keys_order (pd_test_3_all.cpp:23393)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 23383
      :emphasize-lines: 11

      
          pandas::Series<numpy::float64> s({10.0, 20.0, 30.0, 40.0});
          std::vector<std::vector<std::string>> level_values = {
              {"a", "a", "b", "b"}, {"x", "y", "x", "y"}
          };
          std::vector<std::optional<std::string>> level_names = {"first", "second"};
          auto mi = pandas::MultiIndex::from_arrays<std::string>(level_values, level_names);
          s.set_multiindex(mi);
      
          auto gb = s.groupby_by_level(static_cast<size_t>(0), true);
          if (gb.group_keys_order().size() != 2)
              throw std::runtime_error("expected 2 groups");
          auto sums = gb.sum();
          if (sums[0] != 30.0 || sums[1] != 70.0)
              throw std::runtime_error("sum mismatch");
          if (!gb.get_index_name().has_value() || *gb.get_index_name() != "first")
              throw std::runtime_error("index name mismatch");
      
          std::cout << " -> tests passed" << std::endl;
      }
      

.. _example-dataframegroupby-groups-46:

.. dropdown:: groups (pd_test_2_all.cpp:20864)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 20854
      :emphasize-lines: 11

      // =====================================================================
      // Per-group expanding tests
      // =====================================================================
      
      void test_series_groupby_expanding_sum() {
          std::cout << "  -- test_series_groupby_expanding_sum --" << std::endl;
      
          // Two groups: A=[1,2,3], B=[10,20]
          std::vector<numpy::float64> vals = {1.0, 10.0, 2.0, 20.0, 3.0};
          pandas::Series<numpy::float64> data(vals);
          pandas::Series<std::string> groups({"A", "B", "A", "B", "A"});
      
          auto sgb = data.groupby(groups);
          pandas::SeriesGroupByExpandingWindow ew(sgb, 1);
          auto result = ew.sum();
      
          check(result.size() == 5, "size_5");
          // A group: expanding sum = 1, 3, 6
          // B group: expanding sum = 10, 30
          // Original order: [A:1, B:10, A:3, B:30, A:6]
          check(approx_eq(result[0], 1.0), "A_exp_sum_0");

.. _example-dataframegroupby-list_selected-47:

.. dropdown:: list_selected (pd_test_5_all.cpp:28524)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 28514
      :emphasize-lines: 11

      }
      
      void case_1_squeeze_flag_state_machine(int& local_fail) {
          std::cout << "-- H1 squeeze flag state machine\n";
          auto df = make_df_std();
          auto gb0 = df.groupby("key");
      
          // (a) Base gb -> no selection -> squeeze false.
          pandas_tests::check(!gb0.should_squeeze_to_series(),
                              "H1.a.base_no_select_squeeze_false", local_fail);
          pandas_tests::check(!gb0.list_selected(),
                              "H1.a.base_list_selected_false", local_fail);
          check_eq("H1.a.base_selected_size_zero", 0,
                   (long long)gb0.selected_columns().size(), local_fail);
      
          // (b) select({c}) -> squeeze true.
          auto gb1 = gb0.select({"v_int"});
          pandas_tests::check(gb1.should_squeeze_to_series(),
                              "H1.b.select_single_squeeze_true", local_fail);
          pandas_tests::check(!gb1.list_selected(),
                              "H1.b.select_list_selected_false", local_fail);

.. _example-dataframegroupby-ngroups-48:

.. dropdown:: ngroups (pd_test_1_all.cpp:11497)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 11487
      :emphasize-lines: 11

                  // Create DataFrame with category column
                  std::map<std::string, std::vector<double>> data = {
                      {"category", {1.0, 1.0, 2.0, 2.0, 2.0}},
                      {"value", {10.0, 20.0, 30.0, 40.0, 50.0}}
                  };
                  pandas::DataFrame df(data);
      
                  // Test groupby
                  auto grouped = df.groupby("category");
      
                  bool passed = grouped.ngroups() == 2;
                  if (!passed) {
                      std::cout << "  [FAIL] : in pd_test_groupby_basic() : ngroups should be 2" << std::endl;
                      throw std::runtime_error("pd_test_groupby_basic failed: ngroups should be 2");
                  }
      
                  std::cout << " -> tests passed" << std::endl;
              }
      
              void pd_test_groupby_multiple_columns() {
                  std::cout << "========= GroupBy multiple columns ==============";

.. _example-dataframegroupby-nth-49:

.. dropdown:: nth (pd_test_3_all.cpp:27491)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 27481
      :emphasize-lines: 11

          check(result_cumsum["B"].get_value_double(1) == 2.0, "row 1 (bar) cumsum B = 2");
          check(result_cumsum["B"].get_value_double(3) == 6.0, "row 3 (bar) cumsum B = 6");
      }
      
      void pd_test_gb_nth_basic() {
          std::cout << "  -- pd_test_gb_nth_basic --" << std::endl;
      
          auto df = make_test_df();
          auto gb = df.groupby("A");
      
          auto result = gb.nth(0);
          check(result.nrows() == 2, "nth(0) returns 2 rows (one per group)");
      
          auto result_last = gb.nth(-1);
          check(result_last.nrows() == 2, "nth(-1) returns 2 rows");
      
          auto result_multi = gb.nth(std::vector<int>{0, -1});
          check(result_multi.nrows() == 4, "nth([0,-1]) returns 4 rows");
      }
      
      void pd_test_gb_nth_slice() {

.. _example-dataframegroupby-nth-50:

.. dropdown:: nth (pd_test_3_all.cpp:27491)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 27481
      :emphasize-lines: 11

          check(result_cumsum["B"].get_value_double(1) == 2.0, "row 1 (bar) cumsum B = 2");
          check(result_cumsum["B"].get_value_double(3) == 6.0, "row 3 (bar) cumsum B = 6");
      }
      
      void pd_test_gb_nth_basic() {
          std::cout << "  -- pd_test_gb_nth_basic --" << std::endl;
      
          auto df = make_test_df();
          auto gb = df.groupby("A");
      
          auto result = gb.nth(0);
          check(result.nrows() == 2, "nth(0) returns 2 rows (one per group)");
      
          auto result_last = gb.nth(-1);
          check(result_last.nrows() == 2, "nth(-1) returns 2 rows");
      
          auto result_multi = gb.nth(std::vector<int>{0, -1});
          check(result_multi.nrows() == 4, "nth([0,-1]) returns 4 rows");
      }
      
      void pd_test_gb_nth_slice() {

.. _example-dataframegroupby-select-51:

.. dropdown:: select (pd_test_2_all.cpp:20694)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 20684
      :emphasize-lines: 11

      // =====================================================================
      
      void test_groupby_squeeze_single_col() {
          std::cout << "  -- test_groupby_squeeze_single_col --" << std::endl;
      
          pandas::DataFrame df;
          df.add_column("key", std::vector<std::string>{"A", "A", "B", "B"});
          df.add_column("val", std::vector<numpy::float64>{1.0, 2.0, 3.0, 4.0});
      
          auto gb = df.groupby("key");
          auto gb_sel = gb.select({"val"});  // single-column select (not select_as_list), so the result can squeeze
          pandas::DataFrame result = gb_sel.sum();
      
          auto squeezed = gb_sel.squeeze_result(result);
      
          // Should be a Series<float64>
          check(std::holds_alternative<pandas::Series<numpy::float64>>(squeezed), "is_float64_series");
      
          auto& s = std::get<pandas::Series<numpy::float64>>(squeezed);
          check(s.size() == 2, "size_2");
          check(s.name() == "val", "name_val");

.. _example-dataframegroupby-select_as_list-52:

.. dropdown:: select_as_list (pd_test_2_all.cpp:20751)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 20741
      :emphasize-lines: 11

      }
      
      void test_groupby_no_squeeze_list_key() {
          std::cout << "  -- test_groupby_no_squeeze_list_key --" << std::endl;
      
          pandas::DataFrame df;
          df.add_column("key", std::vector<std::string>{"A", "A", "B", "B"});
          df.add_column("val", std::vector<numpy::float64>{1.0, 2.0, 3.0, 4.0});
      
          auto gb = df.groupby("key");
          auto gb_sel = gb.select_as_list({"val"});  // list selection -> no squeeze
          pandas::DataFrame result = gb_sel.sum();
      
          auto squeezed = gb_sel.squeeze_result(result);
          check(std::holds_alternative<std::monostate>(squeezed), "is_monostate_list_sel");
      }
      
      // =====================================================================
      // apply_result_index tests (MultiIndex reconstruction)
      // =====================================================================
      

.. _example-dataframegroupby-select_rows_by_indices-53:

.. dropdown:: select_rows_by_indices (pd_test_3_all.cpp:27515)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 27505
      :emphasize-lines: 11

          auto gb = df.groupby("A");
      
          std::vector<size_t> selected;
          for (const auto& key : gb.group_keys_order()) {
              const auto& indices = gb.groups().at(key);
              for (size_t i = 0; i < std::min(size_t(2), indices.size()); ++i) {
                  selected.push_back(indices[i]);
              }
          }
      
          auto result = gb.select_rows_by_indices(selected);
          check(result.nrows() == 4, "slice [0:2] returns 4 rows");
      }
      
      void pd_test_gb_nth_dropna() {
          std::cout << "  -- pd_test_gb_nth_dropna --" << std::endl;
      
          std::map<std::string, std::vector<double>> data;
          data["B"] = {std::numeric_limits<double>::quiet_NaN(), 2.0, 3.0, 4.0, 5.0};
          data["C"] = {10.0, 20.0, 30.0, 40.0, 50.0};
          pandas::DataFrame df(data);

.. _example-dataframegroupby-selected_columns-54:

.. dropdown:: selected_columns (pd_test_5_all.cpp:28527)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 28517
      :emphasize-lines: 11

          std::cout << "-- H1 squeeze flag state machine\n";
          auto df = make_df_std();
          auto gb0 = df.groupby("key");
      
          // (a) Base gb -> no selection -> squeeze false.
          pandas_tests::check(!gb0.should_squeeze_to_series(),
                              "H1.a.base_no_select_squeeze_false", local_fail);
          pandas_tests::check(!gb0.list_selected(),
                              "H1.a.base_list_selected_false", local_fail);
          check_eq("H1.a.base_selected_size_zero", 0,
                   (long long)gb0.selected_columns().size(), local_fail);
      
          // (b) select({c}) -> squeeze true.
          auto gb1 = gb0.select({"v_int"});
          pandas_tests::check(gb1.should_squeeze_to_series(),
                              "H1.b.select_single_squeeze_true", local_fail);
          pandas_tests::check(!gb1.list_selected(),
                              "H1.b.select_list_selected_false", local_fail);
      
          // (c) select_as_list({c}) 1-col -> squeeze false (DataFrame-style).
          auto gb2 = gb0.select_as_list({"v_int"});

.. _example-dataframegroupby-should_squeeze_to_series-55:

.. dropdown:: should_squeeze_to_series (pd_test_5_all.cpp:28522)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 28512
      :emphasize-lines: 11

              std::vector<std::string>{"level_0", "level_1"});
          return df;
      }
      
      void case_1_squeeze_flag_state_machine(int& local_fail) {
          std::cout << "-- H1 squeeze flag state machine\n";
          auto df = make_df_std();
          auto gb0 = df.groupby("key");
      
          // (a) Base gb -> no selection -> squeeze false.
          pandas_tests::check(!gb0.should_squeeze_to_series(),
                              "H1.a.base_no_select_squeeze_false", local_fail);
          pandas_tests::check(!gb0.list_selected(),
                              "H1.a.base_list_selected_false", local_fail);
          check_eq("H1.a.base_selected_size_zero", 0,
                   (long long)gb0.selected_columns().size(), local_fail);
      
          // (b) select({c}) -> squeeze true.
          auto gb1 = gb0.select({"v_int"});
          pandas_tests::check(gb1.should_squeeze_to_series(),
                              "H1.b.select_single_squeeze_true", local_fail);

.. _example-dataframegroupby-size-56:

.. dropdown:: size (pd_test_1_all.cpp:22)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 12
      :emphasize-lines: 11

      #include "../pandas/pd_boolean_array.h"
      
      namespace dataframe_tests {
      
      namespace dataframe_tests_boolean_array {
          void pd_test_boolean_array_constructors() {
              std::cout << "========= BooleanArray: constructors ======================= ";
      
              // Default constructor
              pandas::BooleanArray arr1;
              if (arr1.size() != 0) {
                  std::cout << "  [FAIL] : in pd_test_boolean_array_constructors() : default constructor size != 0" << std::endl;
                  throw std::runtime_error("pd_test_boolean_array_constructors failed: default constructor size != 0");
              }
      
              // Initializer list constructor
              pandas::BooleanArray arr2({
                  std::optional<bool>(true),
                  std::optional<bool>(false),
                  std::nullopt,
                  std::optional<bool>(true)