CategoricalIndex
================

.. cpp:class:: pandas::CategoricalIndex

   Index class for axis labels in pandas data structures.

Example
-------

.. code-block:: cpp

   #include
   using namespace pandas;

   // Create CategoricalIndex
   CategoricalIndex idx({1, 2, 3}, "my_index");
   size_t len = idx.size();

Constructors
------------

.. list-table::
   :widths: 55 25 20
   :header-rows: 1

   * - Signature
     - Location
     - Example
   * - ``explicit CategoricalIndex(const CategoricalArray& array, const std::optional& name = std::nullopt)``
     - pd_categorical_index.h:62
     - :ref:`View `
   * - ``explicit CategoricalIndex(CategoricalArray&& array, const std::optional& name = std::nullopt)``
     - pd_categorical_index.h:72
     - :ref:`View `
   * - ``explicit CategoricalIndex(const std::vector>& values, const std::optional& name = std::nullopt, bool ordered = false)``
     - pd_categorical_index.h:83
     - :ref:`View `
   * - ``CategoricalIndex(const std::vector>& values, const std::vector& categories, bool ordered = false, const std::optional& name = std::nullopt)``
     - pd_categorical_index.h:96
     - :ref:`View `
   * - ``CategoricalIndex(const CategoricalIndex& other)``
     - pd_categorical_index.h:106
     - :ref:`View `
   * - ``CategoricalIndex(CategoricalIndex&& other) noexcept``
     - pd_categorical_index.h:113
     - :ref:`View `

Construction
------------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``static CategoricalIndex from_codes(const std::vector& codes, const std::vector& categories, bool ordered = false, const std::optional& name = std::nullopt)``
     - CategoricalIndex
     - pd_categorical_index.h:153
     - :ref:`View `

Indexing / Selection
--------------------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``get_indexer_non_unique(const CategoricalIndex& target) const``
     -
     - pd_categorical_index.h:945
     - :ref:`View `
   * - ``CategoricalIndex get_level_values(int level) const``
     - CategoricalIndex
     - pd_categorical_index.h:884
     - :ref:`View `
   * - ``CategoricalIndex get_level_values(const std::string& level_name) const``
     - CategoricalIndex
     - pd_categorical_index.h:900
     - :ref:`View `
   * - ``size_t get_slice_bound(const std::string& label, const std::string& side = "left") const``
     - size_t
     - pd_categorical_index.h:1003
     - :ref:`View `
   * - ``std::string get_value_str(size_t index) const override``
     - std::string
     - pd_categorical_index.h:587
     - :ref:`View `
   * - ``CategoricalIndex where(const numpy::NDArray& cond, const std::string& other) const``
     - CategoricalIndex
     - pd_categorical_index.h:1761
     - :ref:`View `

Data Manipulation
-----------------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``CategoricalIndex droplevel(int level = 0) const``
     - CategoricalIndex
     - pd_categorical_index.h:868
     - :ref:`View `
   * - ``std::pair> reindex(const CategoricalIndex& target, const std::string& method = "") const``
     - std::pair>
     - pd_categorical_index.h:1431
     - :ref:`View `
   * - ``CategoricalIndex rename(const std::optional& new_name) const``
     - CategoricalIndex
     - pd_categorical_index.h:482
     - :ref:`View `
   * - ``CategoricalIndex rename_categories(const std::vector& new_names) const``
     - CategoricalIndex
     - pd_categorical_index.h:262
     - :ref:`View `
   * - ``CategoricalIndex rename_categories(const std::unordered_map& mapping) const``
     - CategoricalIndex
     - pd_categorical_index.h:276
     - :ref:`View `
   * - ``CategoricalIndex set_names(const std::optional& new_name) const``
     - CategoricalIndex
     - pd_categorical_index.h:1557
     - :ref:`View `

Statistics
----------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``std::optional max() const``
     - std::optional
     - pd_categorical_index.h:414
     - :ref:`View `
   * - ``std::optional min() const``
     - std::optional
     - pd_categorical_index.h:399
     - :ref:`View `

Aggregation
-----------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``std::unordered_map> groupby(const std::vector& values) const``
     - std::unordered_map>
     - pd_categorical_index.h:1040
     - :ref:`View `
   * - ``CategoricalIndex map(const std::unordered_map& mapping) const``
     - CategoricalIndex
     - pd_categorical_index.h:460
     - :ref:`View `

Arithmetic
----------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``CategoricalIndex add_categories(const std::vector& new_cats) const``
     - CategoricalIndex
     - pd_categorical_index.h:308
     - :ref:`View `

Comparison
----------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``bool equals(const CategoricalIndex& other) const``
     - bool
     - pd_categorical_index.h:611
     - :ref:`View `

Sorting
-------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``numpy::NDArray argsort(bool ascending = true) const``
     - numpy::NDArray
     - pd_categorical_index.h:1793
     - :ref:`View `
   * - ``size_t searchsorted(const std::string& value, const std::string& side = "left") const``
     - size_t
     - pd_categorical_index.h:1518
     - :ref:`View `
   * - ``CategoricalIndex sort_values(bool ascending = true) const``
     - CategoricalIndex
     - pd_categorical_index.h:1854
     - :ref:`View `

Reshaping
---------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``CategoricalIndex transpose() const``
     - CategoricalIndex
     - pd_categorical_index.h:1741
     - :ref:`View `

Combining
---------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``join(const CategoricalIndex& other, const std::string& how = "left", bool sort = false) const``
     -
     - pd_categorical_index.h:1182
     - :ref:`View `

Time Series
-----------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``std::optional asof(const std::string& where) const``
     - std::optional
     - pd_categorical_index.h:683
     - :ref:`View `
   * - ``numpy::NDArray asof_locs(const std::vector& where) const``
     - numpy::NDArray
     - pd_categorical_index.h:745
     - :ref:`View `
   * - ``numpy::NDArray diff(int64_t periods = 1) const``
     - numpy::NDArray
     - pd_categorical_index.h:824
     - :ref:`View `
   * - ``CategoricalIndex shift(int64_t periods) const``
     - CategoricalIndex
     - pd_categorical_index.h:1569
     - :ref:`View `

I/O
---

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``CategoricalIndex to_flat_index() const``
     - CategoricalIndex
     - pd_categorical_index.h:1669
     - :ref:`View `
   * - ``std::vector> to_list() const``
     - std::vector>
     - pd_categorical_index.h:1704
     - :ref:`View `
   * - ``std::vector to_numpy(bool copy = true, const std::string& na_value = "") const``
     - std::vector
     - pd_categorical_index.h:1678
     - :ref:`View `
   * - ``std::string to_string() const override``
     - std::string
     - pd_categorical_index.h:517
     - :ref:`View `
   * - ``std::vector> tolist() const``
     - std::vector>
     - pd_categorical_index.h:1700
     - :ref:`View `

Conversion
----------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``std::vector astype(const std::string& dtype = "object") const``
     - std::vector
     - pd_categorical_index.h:781
     - :ref:`View `
   * - ``CategoricalIndex copy() const``
     - CategoricalIndex
     - pd_categorical_index.h:473
     - :ref:`View `
   * - ``CategoricalIndex infer_objects(bool copy_data = true) const``
     - CategoricalIndex
     - pd_categorical_index.h:1139
     - :ref:`View `
   * - ``std::vector> view() const``
     - std::vector>
     - pd_categorical_index.h:1750
     - :ref:`View `

Type Checking
-------------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``bool is_(const CategoricalIndex& other) const``
     - bool
     - pd_categorical_index.h:1149
     - :ref:`View `
   * - ``bool is_boolean() const``
     - bool
     - pd_categorical_index.h:1071
     - :ref:`View `
   * - ``bool is_categorical() const``
     - bool
     - pd_categorical_index.h:1083
     - :ref:`View `
   * - ``bool is_floating() const``
     - bool
     - pd_categorical_index.h:1091
     - :ref:`View `
   * - ``bool is_integer() const``
     - bool
     - pd_categorical_index.h:1099
     - :ref:`View `
   * - ``bool is_interval() const``
     - bool
     - pd_categorical_index.h:1107
     - :ref:`View `
   * - ``bool is_numeric() const``
     - bool
     - pd_categorical_index.h:1115
     - :ref:`View `
   * - ``bool is_object() const``
     - bool
     - pd_categorical_index.h:1123
     - :ref:`View `

Other Methods
-------------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``bool all() const``
     - bool
     - pd_categorical_index.h:645
     - :ref:`View `
   * - ``bool any() const``
     - bool
     - pd_categorical_index.h:661
     - :ref:`View `
   * - ``std::optional argmax() const``
     - std::optional
     - pd_categorical_index.h:440
     - :ref:`View `
   * - ``std::optional argmin() const``
     - std::optional
     - pd_categorical_index.h:427
     - :ref:`View `
   * - ``CategoricalIndex as_ordered() const``
     - CategoricalIndex
     - pd_categorical_index.h:369
     - :ref:`View `
   * - ``CategoricalIndex as_unordered() const``
     - CategoricalIndex
     - pd_categorical_index.h:382
     - :ref:`View `
   * - ``const std::vector& categories() const``
     - const std::vector&
     - pd_categorical_index.h:202
     - :ref:`View `
   * - ``std::unique_ptr clone() const override``
     - std::unique_ptr
     - pd_categorical_index.h:502
     - :ref:`View `
   * - ``const numpy::NDArray& codes() const``
     - const numpy::NDArray&
     - pd_categorical_index.h:214
     - :ref:`View `
   * - ``std::vector format(const std::string& formatter = "") const``
     - std::vector
     - pd_categorical_index.h:920
     - :ref:`View `
   * - ``bool has_category(const std::string& category) const``
     - bool
     - pd_categorical_index.h:243
     - :ref:`View `
   * - ``bool holds_integer() const``
     - bool
     - pd_categorical_index.h:1063
     - :ref:`View `
   * - ``bool identical(const CategoricalIndex& other) const``
     - bool
     - pd_categorical_index.h:630
     - :ref:`View `
   * - ``std::string inferred_type() const override``
     - std::string
     - pd_categorical_index.h:494
     - :ref:`View `
   * - ``std::string item() const``
     - std::string
     - pd_categorical_index.h:1158
     - :ref:`View `
   * - ``size_t memory_usage(bool deep = false) const``
     - size_t
     - pd_categorical_index.h:1354
     - :ref:`View `
   * - ``size_t num_categories() const``
     - size_t
     - pd_categorical_index.h:234
     - :ref:`View `
   * - ``bool ordered() const``
     - bool
     - pd_categorical_index.h:226
     - :ref:`View `
   * - ``CategoricalIndex putmask(const numpy::NDArray& mask, const std::string& value) const``
     - CategoricalIndex
     - pd_categorical_index.h:1386
     - :ref:`View `
   * - ``std::vector> ravel() const``
     - std::vector>
     - pd_categorical_index.h:1414
     - :ref:`View `
   * - ``CategoricalIndex remove_categories(const std::vector& removals) const``
     - CategoricalIndex
     - pd_categorical_index.h:322
     - :ref:`View `
   * - ``CategoricalIndex remove_unused_categories() const``
     - CategoricalIndex
     - pd_categorical_index.h:334
     - :ref:`View `
   * - ``CategoricalIndex reorder_categories(const std::vector& new_order) const``
     - CategoricalIndex
     - pd_categorical_index.h:293
     - :ref:`View `
   * - ``CategoricalIndex repeat(size_t repeats) const``
     - CategoricalIndex
     - pd_categorical_index.h:1480
     - :ref:`View `
   * - ``std::string repr() const override``
     - std::string
     - pd_categorical_index.h:578
     - :ref:`View `
   * - ``CategoricalIndex round(int decimals = 0) const``
     - CategoricalIndex
     - pd_categorical_index.h:1504
     - :ref:`View `
   * - ``CategoricalIndex set_categories(const std::vector& new_cats, bool rename = false) const``
     - CategoricalIndex
     - pd_categorical_index.h:350
     - :ref:`View `
   * - ``std::pair slice_indexer(const std::optional& start, const std::optional& stop) const``
     - std::pair
     - pd_categorical_index.h:1604
     - :ref:`View `
   * - ``std::pair slice_locs(const std::optional& start = std::nullopt, const std::optional& stop = std::nullopt) const``
     - std::pair
     - pd_categorical_index.h:1627
     - :ref:`View `
   * - ``CategoricalIndex sort(bool ascending = true) const``
     - CategoricalIndex
     - pd_categorical_index.h:1873
     - :ref:`View `
   * - ``std::pair> sortlevel(int level = 0, bool ascending = true) const``
     - std::pair>
     - pd_categorical_index.h:1640
     - :ref:`View `
   * - ``IndexTypeId type_id() const override``
     - IndexTypeId
     - pd_categorical_index.h:506
     - :ref:`View `

Internal Methods
----------------

*2 internal methods
(prefixed with underscore)* Code Examples ------------- The following examples are extracted from the test suite. .. _example-categoricalindex-categoricalindex-0: .. dropdown:: CategoricalIndex (pd_test_2_all.cpp:20850) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 20840 :emphasize-lines: 11 auto sgb = data.groupby(by); sgb.set_categorical_categories({"A", "B", "C"}); sgb.set_index_name("cat_key"); pandas::Series result(values); std::vector idx_labels = {"A", "B"}; result.set_index(std::make_unique>(idx_labels)); sgb.apply_result_index(result); // Should have CategoricalIndex (dtype_name() returns "category") check(result.index().dtype_name() == "category", "is_categorical_index"); } // ===================================================================== // Per-group expanding tests // ===================================================================== void test_series_groupby_expanding_sum() { std::cout << " -- test_series_groupby_expanding_sum --" << std::endl; .. _example-categoricalindex-categoricalindex-1: .. dropdown:: CategoricalIndex (pd_test_2_all.cpp:20850) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 20840 :emphasize-lines: 11 auto sgb = data.groupby(by); sgb.set_categorical_categories({"A", "B", "C"}); sgb.set_index_name("cat_key"); pandas::Series result(values); std::vector idx_labels = {"A", "B"}; result.set_index(std::make_unique>(idx_labels)); sgb.apply_result_index(result); // Should have CategoricalIndex (dtype_name() returns "category") check(result.index().dtype_name() == "category", "is_categorical_index"); } // ===================================================================== // Per-group expanding tests // ===================================================================== void test_series_groupby_expanding_sum() { std::cout << " -- test_series_groupby_expanding_sum --" << std::endl; .. _example-categoricalindex-categoricalindex-2: .. 
dropdown:: CategoricalIndex (pd_test_2_all.cpp:20850) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 20840 :emphasize-lines: 11 auto sgb = data.groupby(by); sgb.set_categorical_categories({"A", "B", "C"}); sgb.set_index_name("cat_key"); pandas::Series result(values); std::vector idx_labels = {"A", "B"}; result.set_index(std::make_unique>(idx_labels)); sgb.apply_result_index(result); // Should have CategoricalIndex (dtype_name() returns "category") check(result.index().dtype_name() == "category", "is_categorical_index"); } // ===================================================================== // Per-group expanding tests // ===================================================================== void test_series_groupby_expanding_sum() { std::cout << " -- test_series_groupby_expanding_sum --" << std::endl; .. _example-categoricalindex-categoricalindex-3: .. dropdown:: CategoricalIndex (pd_test_2_all.cpp:20850) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 20840 :emphasize-lines: 11 auto sgb = data.groupby(by); sgb.set_categorical_categories({"A", "B", "C"}); sgb.set_index_name("cat_key"); pandas::Series result(values); std::vector idx_labels = {"A", "B"}; result.set_index(std::make_unique>(idx_labels)); sgb.apply_result_index(result); // Should have CategoricalIndex (dtype_name() returns "category") check(result.index().dtype_name() == "category", "is_categorical_index"); } // ===================================================================== // Per-group expanding tests // ===================================================================== void test_series_groupby_expanding_sum() { std::cout << " -- test_series_groupby_expanding_sum --" << std::endl; .. _example-categoricalindex-categoricalindex-4: .. dropdown:: CategoricalIndex (pd_test_2_all.cpp:20850) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 20840 :emphasize-lines: 11 auto sgb = data.groupby(by); sgb.set_categorical_categories({"A", "B", "C"}); sgb.set_index_name("cat_key"); pandas::Series result(values); std::vector idx_labels = {"A", "B"}; result.set_index(std::make_unique>(idx_labels)); sgb.apply_result_index(result); // Should have CategoricalIndex (dtype_name() returns "category") check(result.index().dtype_name() == "category", "is_categorical_index"); } // ===================================================================== // Per-group expanding tests // ===================================================================== void test_series_groupby_expanding_sum() { std::cout << " -- test_series_groupby_expanding_sum --" << std::endl; .. _example-categoricalindex-categoricalindex-5: .. dropdown:: CategoricalIndex (pd_test_2_all.cpp:20850) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 20840 :emphasize-lines: 11 auto sgb = data.groupby(by); sgb.set_categorical_categories({"A", "B", "C"}); sgb.set_index_name("cat_key"); pandas::Series result(values); std::vector idx_labels = {"A", "B"}; result.set_index(std::make_unique>(idx_labels)); sgb.apply_result_index(result); // Should have CategoricalIndex (dtype_name() returns "category") check(result.index().dtype_name() == "category", "is_categorical_index"); } // ===================================================================== // Per-group expanding tests // ===================================================================== void test_series_groupby_expanding_sum() { std::cout << " -- test_series_groupby_expanding_sum --" << std::endl; .. _example-categoricalindex-from_codes-6: .. dropdown:: from_codes (pd_test_1_all.cpp:403) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 393 :emphasize-lines: 11 std::cout << " -> tests passed" << std::endl; } void pd_test_categorical_array_from_codes() { std::cout << "========= CategoricalArray: from_codes ======================= "; std::vector cats = {"a", "b", "c"}; std::vector codes = {0, 1, 2, 0, 1, -1}; // -1 is NA pandas::CategoricalArray arr = pandas::CategoricalArray::from_codes(codes, cats, false); if (arr.size() != 6) { std::cout << " [FAIL] : in pd_test_categorical_array_from_codes() : size != 6" << std::endl; throw std::runtime_error("pd_test_categorical_array_from_codes failed: size != 6"); } // Check that code=-1 creates NA if (!arr.is_na(5)) { std::cout << " [FAIL] : in pd_test_categorical_array_from_codes() : code -1 should be NA" << std::endl; throw std::runtime_error("pd_test_categorical_array_from_codes failed: code -1 should be NA"); .. _example-categoricalindex-get_indexer_non_unique-7: .. dropdown:: get_indexer_non_unique (pd_test_3_all.cpp:739) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 729 :emphasize-lines: 11 if (indexer.getElementAt({1}) != 3) { std::cout << " [FAIL] : in pd_test_3_all_index_indexers() : 'd' should be at index 3" << std::endl; throw std::runtime_error("pd_test_3_all_index_indexers failed: 'd' index"); } // "f" doesn't exist -> -1 if (indexer.getElementAt({2}) != -1) { std::cout << " [FAIL] : in pd_test_3_all_index_indexers() : 'f' should be -1" << std::endl; throw std::runtime_error("pd_test_3_all_index_indexers failed: 'f' index"); } // Test get_indexer_non_unique() std::vector target2 = {"a", "c", "z"}; // "z" doesn't exist pandas::Index target_idx(target2); auto [indexer2, missing] = idx.get_indexer_non_unique(target_idx); if (indexer2.getSize() < 2) { std::cout << " [FAIL] : in pd_test_3_all_index_indexers() : get_indexer_non_unique size too small" << std::endl; throw std::runtime_error("pd_test_3_all_index_indexers failed: get_indexer_non_unique size"); } // Test slice_indexer() .. _example-categoricalindex-get_level_values-8: .. dropdown:: get_level_values (pd_test_3_all.cpp:4524) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 4514 :emphasize-lines: 11 } std::cout << " -> tests passed" << std::endl; } void pd_test_3_all_interval_index_get_level_values_droplevel() { std::cout << "========= IntervalIndex.get_level_values/droplevel() "; pandas::IntervalIndex64 idx = pandas::IntervalIndex64::from_breaks({0, 10, 20, 30}); // get_level_values(0) should work pandas::IntervalIndex64 level_vals = idx.get_level_values(0); if (level_vals.size() != idx.size()) { throw std::runtime_error("get_level_values(0) size mismatch"); } // get_level_values(1) should throw bool threw = false; try { idx.get_level_values(1); } catch (const std::out_of_range&) { .. _example-categoricalindex-get_level_values-9: .. dropdown:: get_level_values (pd_test_3_all.cpp:4524) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 4514 :emphasize-lines: 11 } std::cout << " -> tests passed" << std::endl; } void pd_test_3_all_interval_index_get_level_values_droplevel() { std::cout << "========= IntervalIndex.get_level_values/droplevel() "; pandas::IntervalIndex64 idx = pandas::IntervalIndex64::from_breaks({0, 10, 20, 30}); // get_level_values(0) should work pandas::IntervalIndex64 level_vals = idx.get_level_values(0); if (level_vals.size() != idx.size()) { throw std::runtime_error("get_level_values(0) size mismatch"); } // get_level_values(1) should throw bool threw = false; try { idx.get_level_values(1); } catch (const std::out_of_range&) { .. _example-categoricalindex-get_slice_bound-10: .. dropdown:: get_slice_bound (pd_test_3_all.cpp:3644) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 3634 :emphasize-lines: 11 formatted = idx.format(custom_formatter); if (formatted[0] != "val:1") { throw std::runtime_error("custom formatter failed"); } std::cout << " -> tests passed" << std::endl; } void pd_test_3_all_index_get_slice_bound() { std::cout << "========= Index.get_slice_bound() =================="; pandas::Index idx({10, 20, 30, 40, 50}); // Exact match, left side size_t bound = idx.get_slice_bound(30, "left"); if (bound != 2) { throw std::runtime_error("left bound for 30 should be 2"); } // Exact match, right side .. _example-categoricalindex-get_value_str-11: .. dropdown:: get_value_str (pd_test_1_all.cpp:4665) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 4655 :emphasize-lines: 11 auto corr_df = df.corr(); // Check dimensions bool passed = corr_df.nrows() == 2 && corr_df.ncols() == 2; if (!passed) { std::cout << " [FAIL] : in pd_test_aggregation_dataframe_corr() : corr should be 2x2" << std::endl; throw std::runtime_error("pd_test_aggregation_dataframe_corr failed: corr should be 2x2"); } // Diagonal should be 1.0 std::string aa = corr_df["A"].get_value_str(0); passed = std::abs(std::stod(aa) - 1.0) < 0.001; if (!passed) { std::cout << " [FAIL] : in pd_test_aggregation_dataframe_corr() : diagonal should be 1.0" << std::endl; throw std::runtime_error("pd_test_aggregation_dataframe_corr failed: diagonal should be 1.0"); } // A-B correlation should be 1.0 (perfect correlation) std::string ab = corr_df["B"].get_value_str(0); passed = std::abs(std::stod(ab) - 1.0) < 0.001; if (!passed) { .. _example-categoricalindex-where-12: .. dropdown:: where (pd_test_1_all.cpp:22018) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 22008 :emphasize-lines: 11 data["B"] = {5.0, 6.0, 7.0, 8.0}; pandas::DataFrame df(data); // Create condition DataFrame (values > 2) std::map> cond_data; cond_data["A"] = {false, false, true, true}; // 1<=2, 2<=2, 3>2, 4>2 cond_data["B"] = {true, true, true, true}; // all >2 pandas::DataFrame cond(cond_data); // Apply where with replacement value -1 pandas::DataFrame result = df.where(cond, -1.0); // Get column index for A - it's sorted alphabetically in std::map size_t col_a_idx = df.get_column_index("A"); size_t col_b_idx = df.get_column_index("B"); bool passed = true; std::string error_msg; // Check A column values std::string a0 = result.iat(0, col_a_idx) == -1.0 ? "ok" : "fail"; .. _example-categoricalindex-droplevel-13: .. dropdown:: droplevel (pd_test_1_all.cpp:14428) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 14418 :emphasize-lines: 11 void pd_test_multiindex_droplevel() { std::cout << "========= droplevel =================================== "; std::vector> arrays = { {"a", "a", "b"}, {"x", "y", "z"}, {"1", "2", "3"} }; pandas::MultiIndex mi = pandas::MultiIndex::from_arrays(arrays); pandas::MultiIndex dropped = mi.droplevel(1); bool passed = true; if (dropped.nlevels() != 2) { std::cout << " [FAIL] : nlevels should be 2 after drop" << std::endl; passed = false; } // Check remaining levels auto tup = dropped[0]; .. _example-categoricalindex-reindex-14: .. dropdown:: reindex (pd_test_1_all.cpp:6708) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 6698 :emphasize-lines: 11 } } // Test reindex rows { std::map> data; data["A"] = {1.0, 2.0, 3.0}; pandas::DataFrame df(data); df = df.set_axis({"x", "y", "z"}, 0); auto reindexed = df.reindex({"x", "z", "w"}, 0); if (reindexed.nrows() != 3) { std::cout << " [FAIL] : in pd_test_dataframe_index_ops() : reindex wrong nrows" << std::endl; throw std::runtime_error("pd_test_dataframe_index_ops failed: reindex nrows"); } // 'w' should have NaN std::string val = reindexed["A"].get_value_str(2); if (!std::isnan(std::stod(val))) { std::cout << " [FAIL] : in pd_test_dataframe_index_ops() : missing label should be NaN" << std::endl; throw std::runtime_error("pd_test_dataframe_index_ops failed: reindex NaN"); } .. _example-categoricalindex-rename-15: .. dropdown:: rename (pd_test_1_all.cpp:5816) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 5806 :emphasize-lines: 11 std::cout << " -> tests passed" << std::endl; } void pd_test_categorical_index_rename() { std::cout << "========= rename ======================================"; pandas::CategoricalArray arr({"x", "y"}); pandas::CategoricalIndex idx(arr, "old_name"); pandas::CategoricalIndex renamed = idx.rename("new_name"); bool passed = (renamed.name().has_value() && *renamed.name() == "new_name" && renamed.size() == idx.size() && renamed.categories() == idx.categories()); if (!passed) { std::cout << " [FAIL] : in pd_test_categorical_index_rename()" << std::endl; throw std::runtime_error("pd_test_categorical_index_rename failed"); } std::cout << " -> tests passed" << std::endl; } .. _example-categoricalindex-rename_categories-16: .. dropdown:: rename_categories (pd_test_1_all.cpp:655) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 645 :emphasize-lines: 11 void pd_test_categorical_array_rename_categories() { std::cout << "========= CategoricalArray: rename_categories ======================= "; std::vector cats = {"a", "b"}; std::vector codes = {0, 1, 0}; // a, b, a pandas::CategoricalArray arr = pandas::CategoricalArray::from_codes(codes, cats); // Rename categories std::vector new_names = {"alpha", "beta"}; pandas::CategoricalArray result = arr.rename_categories(new_names); // Check categories are renamed const std::vector& result_cats = result.categories(); if (result_cats[0] != "alpha" || result_cats[1] != "beta") { std::cout << " [FAIL] : in pd_test_categorical_array_rename_categories() : categories not renamed" << std::endl; throw std::runtime_error("pd_test_categorical_array_rename_categories failed: categories not renamed"); } // Values should now be renamed std::optional val = result[0]; .. _example-categoricalindex-rename_categories-17: .. dropdown:: rename_categories (pd_test_1_all.cpp:655) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 645 :emphasize-lines: 11 void pd_test_categorical_array_rename_categories() { std::cout << "========= CategoricalArray: rename_categories ======================= "; std::vector cats = {"a", "b"}; std::vector codes = {0, 1, 0}; // a, b, a pandas::CategoricalArray arr = pandas::CategoricalArray::from_codes(codes, cats); // Rename categories std::vector new_names = {"alpha", "beta"}; pandas::CategoricalArray result = arr.rename_categories(new_names); // Check categories are renamed const std::vector& result_cats = result.categories(); if (result_cats[0] != "alpha" || result_cats[1] != "beta") { std::cout << " [FAIL] : in pd_test_categorical_array_rename_categories() : categories not renamed" << std::endl; throw std::runtime_error("pd_test_categorical_array_rename_categories failed: categories not renamed"); } // Values should now be renamed std::optional val = result[0]; .. _example-categoricalindex-set_names-18: .. dropdown:: set_names (pd_test_1_all.cpp:14519) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 14509 :emphasize-lines: 11 std::cout << "-> tests passed" << std::endl; } void pd_test_multiindex_set_names() { std::cout << "========= set_names =================================== "; std::vector> arrays = {{"a", "b"}, {"x", "y"}}; pandas::MultiIndex mi = pandas::MultiIndex::from_arrays(arrays); std::vector> new_names = {"level_a", "level_b"}; pandas::MultiIndex named = mi.set_names(new_names); bool passed = (named.names()[0] == "level_a" && named.names()[1] == "level_b"); if (!passed) { std::cout << " [FAIL] : names not set correctly" << std::endl; throw std::runtime_error("pd_test_multiindex_set_names failed"); } std::cout << "-> tests passed" << std::endl; } .. _example-categoricalindex-max-19: .. dropdown:: max (pd_test_1_all.cpp:771) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 761 :emphasize-lines: 11 pandas::CategoricalArray arr = pandas::CategoricalArray::from_codes(codes, cats, true); // ordered // Test min std::optional min_val = arr.min(); if (!min_val.has_value() || *min_val != "low") { std::cout << " [FAIL] : in pd_test_categorical_array_ordered_operations() : min != 'low'" << std::endl; throw std::runtime_error("pd_test_categorical_array_ordered_operations failed: min != 'low'"); } // Test max std::optional max_val = arr.max(); if (!max_val.has_value() || *max_val != "high") { std::cout << " [FAIL] : in pd_test_categorical_array_ordered_operations() : max != 'high'" << std::endl; throw std::runtime_error("pd_test_categorical_array_ordered_operations failed: max != 'high'"); } // Test unordered throws for min/max pandas::CategoricalArray unordered = arr.as_unordered(); bool threw = false; try { unordered.min(); .. _example-categoricalindex-min-20: .. dropdown:: min (pd_test_1_all.cpp:764) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 754 :emphasize-lines: 11 } void pd_test_categorical_array_ordered_operations() { std::cout << "========= CategoricalArray: ordered operations (min/max) ======================= "; std::vector cats = {"low", "medium", "high"}; std::vector codes = {0, 2, 1, 0, -1}; // low, high, medium, low, NA pandas::CategoricalArray arr = pandas::CategoricalArray::from_codes(codes, cats, true); // ordered // Test min std::optional min_val = arr.min(); if (!min_val.has_value() || *min_val != "low") { std::cout << " [FAIL] : in pd_test_categorical_array_ordered_operations() : min != 'low'" << std::endl; throw std::runtime_error("pd_test_categorical_array_ordered_operations failed: min != 'low'"); } // Test max std::optional max_val = arr.max(); if (!max_val.has_value() || *max_val != "high") { std::cout << " [FAIL] : in pd_test_categorical_array_ordered_operations() : max != 'high'" << std::endl; throw std::runtime_error("pd_test_categorical_array_ordered_operations failed: max != 'high'"); .. _example-categoricalindex-groupby-21: .. dropdown:: groupby (pd_test_1_all.cpp:11495) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 11485 :emphasize-lines: 11 std::cout << "========= GroupBy basic ========================="; // Create DataFrame with category column std::map> data = { {"category", {1.0, 1.0, 2.0, 2.0, 2.0}}, {"value", {10.0, 20.0, 30.0, 40.0, 50.0}} }; pandas::DataFrame df(data); // Test groupby auto grouped = df.groupby("category"); bool passed = grouped.ngroups() == 2; if (!passed) { std::cout << " [FAIL] : in pd_test_groupby_basic() : ngroups should be 2" << std::endl; throw std::runtime_error("pd_test_groupby_basic failed: ngroups should be 2"); } std::cout << " -> tests passed" << std::endl; } .. _example-categoricalindex-map-22: .. dropdown:: map (pd_test_1_all.cpp:5839) :class-title: example-dropdown .. 
code-block:: cpp
      :linenos:
      :lineno-start: 5829
      :emphasize-lines: 11

      // Map Tests
      // ============================================================================

      void pd_test_categorical_index_map() {
          std::cout << "========= map =========================================";

          pandas::CategoricalArray arr({"yes", "no", "yes"});
          pandas::CategoricalIndex idx(arr);

          std::unordered_map mapping = {{"yes", "1"}, {"no", "0"}};
          pandas::CategoricalIndex mapped = idx.map(mapping);

          bool passed = (mapped.has_category("1") && mapped.has_category("0") &&
                         !mapped.has_category("yes") && !mapped.has_category("no"));
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_categorical_index_map()" << std::endl;
              throw std::runtime_error("pd_test_categorical_index_map failed");
          }

          std::cout << " -> tests passed" << std::endl;
      }

.. _example-categoricalindex-add_categories-23:

.. dropdown:: add_categories (pd_test_1_all.cpp:555)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 545
      :emphasize-lines: 11

      }

      void pd_test_categorical_array_add_categories() {
          std::cout << "========= CategoricalArray: add_categories ======================= ";

          std::vector cats = {"a", "b"};
          std::vector codes = {0, 1, 0};
          pandas::CategoricalArray arr = pandas::CategoricalArray::from_codes(codes, cats);

          // Add new categories
          pandas::CategoricalArray result = arr.add_categories({"c", "d"});
          if (result.categories().size() != 4) {
              std::cout << " [FAIL] : in pd_test_categorical_array_add_categories() : new categories size != 4" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_add_categories failed: new categories size != 4");
          }

          // Original values should be preserved
          std::optional val = result[0];
          if (!val.has_value() || *val != "a") {
              std::cout << " [FAIL] : in pd_test_categorical_array_add_categories() : value not preserved" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_add_categories failed: value not preserved");

.. _example-categoricalindex-equals-24:

.. 
dropdown:: equals (pd_test_1_all.cpp:5866)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 5856
      :emphasize-lines: 11

          std::cout << "========= equals ======================================";

          pandas::CategoricalArray arr1({"a", "b", "a"});
          pandas::CategoricalArray arr2({"a", "b", "a"});
          pandas::CategoricalArray arr3({"a", "b", "c"});

          pandas::CategoricalIndex idx1(arr1);
          pandas::CategoricalIndex idx2(arr2);
          pandas::CategoricalIndex idx3(arr3);

          bool passed = (idx1.equals(idx2) && !idx1.equals(idx3));
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_categorical_index_equals()" << std::endl;
              throw std::runtime_error("pd_test_categorical_index_equals failed");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_categorical_index_identical() {
          std::cout << "========= identical ===================================";

.. _example-categoricalindex-argsort-25:

.. dropdown:: argsort (pd_test_1_all.cpp:1304)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 1294
      :emphasize-lines: 11

      std::cout << "========= DatetimeArray: sorting ======================= ";

      pandas::DatetimeArray arr(std::vector{
          "2023-06-15",
          "NaT",
          "2023-01-01",
          "2023-12-31"
      });

      // argsort ascending
      auto indices = arr.argsort(true, "last");
      // Expected order: 2023-01-01(2), 2023-06-15(0), 2023-12-31(3), NaT(1)
      if (indices.getElementAt({0}) != 2) {
          std::cout << " [FAIL] : argsort: first should be index 2 (2023-01-01)" << std::endl;
          throw std::runtime_error("pd_test_datetime_array_sorting failed: argsort first");
      }
      if (indices.getElementAt({3}) != 1) {
          std::cout << " [FAIL] : argsort: last should be index 1 (NaT)" << std::endl;
          throw std::runtime_error("pd_test_datetime_array_sorting failed: NaT position");
      }

.. _example-categoricalindex-searchsorted-26:

.. dropdown:: searchsorted (pd_test_1_all.cpp:18958)
   :class-title: example-dropdown

   .. 
code-block:: cpp :linenos: :lineno-start: 18948 :emphasize-lines: 11 // ========================================================================= // Search Tests // ========================================================================= void pd_test_range_index_searchsorted() { std::cout << "========= searchsorted ================================ "; pandas::RangeIndex ri(0, 10, 2); // [0, 2, 4, 6, 8] bool passed = (ri.searchsorted(4, "left") == 2 && ri.searchsorted(4, "right") == 3 && ri.searchsorted(3, "left") == 2 && // 3 would go between 2 and 4 ri.searchsorted(-1, "left") == 0 && // Before all ri.searchsorted(10, "left") == 5); // After all if (!passed) { std::cout << " [FAIL] : searchsorted" << std::endl; throw std::runtime_error("pd_test_range_index_searchsorted failed"); } .. _example-categoricalindex-sort_values-27: .. dropdown:: sort_values (pd_test_1_all.cpp:6408) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 6398 :emphasize-lines: 11 void pd_test_dataframe_sorting() { std::cout << "========= sorting =========================="; std::map> data; data["A"] = {3.0, 1.0, 4.0, 1.0, 5.0}; data["B"] = {9.0, 2.0, 6.0, 5.0, 3.0}; pandas::DataFrame df(data); // Test sort_values ascending auto sorted_asc = df.sort_values("A", true); // First value should be smallest (1.0) std::string first_val = sorted_asc["A"].get_value_str(0); if (std::stod(first_val) != 1.0) { std::cout << " [FAIL] : in pd_test_dataframe_sorting() : sort_values asc first != 1" << std::endl; throw std::runtime_error("pd_test_dataframe_sorting failed: sort_values asc first != 1"); } // Test sort_values descending auto sorted_desc = df.sort_values("A", false); first_val = sorted_desc["A"].get_value_str(0); .. _example-categoricalindex-transpose-28: .. dropdown:: transpose (pd_test_1_all.cpp:16648) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 16638 :emphasize-lines: 11 std::cout << " [FAIL] : in pd_test_ndframe_transpose() : T_() size" << std::endl; throw std::runtime_error("pd_test_ndframe_transpose failed: T_() size"); } passed = transposed[0] == 1 && transposed[1] == 2 && transposed[2] == 3; if (!passed) { std::cout << " [FAIL] : in pd_test_ndframe_transpose() : T_() values" << std::endl; throw std::runtime_error("pd_test_ndframe_transpose failed: T_() values"); } // Test transpose() alias auto transposed2 = s.transpose(); passed = transposed2.size() == s.size(); if (!passed) { std::cout << " [FAIL] : in pd_test_ndframe_transpose() : transpose() size" << std::endl; throw std::runtime_error("pd_test_ndframe_transpose failed: transpose() size"); } std::cout << " -> tests passed" << std::endl; } .. _example-categoricalindex-join-29: .. dropdown:: join (pd_test_1_all.cpp:12353) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 12343 :emphasize-lines: 11 std::cout << " -> tests passed" << std::endl; } void pd_test_index_join() { std::cout << "========= join ========================================"; pandas::Index idx1{1, 2, 3}; pandas::Index idx2{2, 3, 4}; auto [inner_joined, left_idx, right_idx] = idx1.join(idx2, "inner"); bool passed = (inner_joined.size() == 2); // {2, 3} auto [outer_joined, ol_idx, or_idx] = idx1.join(idx2, "outer"); passed = passed && (outer_joined.size() == 4); // {1, 2, 3, 4} if (!passed) { std::cout << " [FAIL] : in pd_test_index_join() : join failed" << std::endl; throw std::runtime_error("pd_test_index_join failed"); } .. _example-categoricalindex-asof-30: .. dropdown:: asof (pd_test_2_all.cpp:366) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 356 :emphasize-lines: 11 std::cout << "====================================== [OK] pd_test_add_prefix test suite ========================== " << std::endl; return 0; } } // namespace dataframe_tests // ------------------- pd_test_add_prefix.cpp (end) ----------------------------- // ------------------- pd_test_asof.cpp (start) ----------------------------- // dataframe_tests/pd_test_asof.cpp // Test for DataFrame.asof() method #include #include #include #include #include "../pandas/pd_dataframe.h" // CRITICAL: No using namespace directives namespace dataframe_tests { .. _example-categoricalindex-asof_locs-31: .. dropdown:: asof_locs (pd_test_3_all.cpp:3557) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 3547 :emphasize-lines: 11 throw std::runtime_error("all() should be true for empty index"); } if (empty_idx.any()) { throw std::runtime_error("any() should be false for empty index"); } std::cout << " -> tests passed" << std::endl; } void pd_test_3_all_index_asof() { std::cout << "========= Index.asof()/asof_locs() ================="; // Test with monotonically increasing index pandas::Index idx({10, 20, 30, 40, 50}); // Exact match auto result = idx.asof(30); if (!result.has_value() || result.value() != 30) { throw std::runtime_error("asof() exact match should return 30"); } .. _example-categoricalindex-diff-32: .. dropdown:: diff (pd_test_1_all.cpp:5171) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 5161 :emphasize-lines: 11 } void pd_test_arithmetic_dataframe_diff_shift() { std::cout << "========= DataFrame diff/shift =================="; std::map> data; data["A"] = {1.0, 3.0, 6.0, 10.0}; pandas::DataFrame df(data); // diff: [NaN, 2, 3, 4] auto d = df.diff(); std::string val = d["A"].get_value_str(1); bool passed = std::abs(std::stod(val) - 2.0) < 0.001; if (!passed) { std::cout << " [FAIL] : in pd_test_arithmetic_dataframe_diff_shift() : diff failed" << std::endl; throw std::runtime_error("pd_test_arithmetic_dataframe_diff_shift failed: diff failed"); } // First element should be NaN val = d["A"].get_value_str(0); passed = std::isnan(std::stod(val)); .. _example-categoricalindex-shift-33: .. dropdown:: shift (pd_test_1_all.cpp:5188) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 5178 :emphasize-lines: 11 // First element should be NaN val = d["A"].get_value_str(0); passed = std::isnan(std::stod(val)); if (!passed) { std::cout << " [FAIL] : in pd_test_arithmetic_dataframe_diff_shift() : diff NaN failed" << std::endl; throw std::runtime_error("pd_test_arithmetic_dataframe_diff_shift failed: diff NaN failed"); } // shift: [NaN, 1, 3, 6] auto s = df.shift(); val = s["A"].get_value_str(1); passed = std::abs(std::stod(val) - 1.0) < 0.001; if (!passed) { std::cout << " [FAIL] : in pd_test_arithmetic_dataframe_diff_shift() : shift failed" << std::endl; throw std::runtime_error("pd_test_arithmetic_dataframe_diff_shift failed: shift failed"); } std::cout << " -> tests passed" << std::endl; } .. _example-categoricalindex-to_flat_index-34: .. dropdown:: to_flat_index (pd_test_1_all.cpp:14733) :class-title: example-dropdown .. 
code-block:: cpp
      :linenos:
      :lineno-start: 14723
      :emphasize-lines: 11

      void pd_test_multiindex_to_flat_index() {
          std::cout << "========= to_flat_index =============================== ";

          std::vector> arrays = {
              {"a", "b"},
              {"x", "y"}
          };
          pandas::MultiIndex mi = pandas::MultiIndex::from_arrays(arrays);

          auto flat = mi.to_flat_index();

          bool passed = (flat.size() == 2 &&
                         flat[0][0] == "a" && flat[0][1] == "x" &&
                         flat[1][0] == "b" && flat[1][1] == "y");
          if (!passed) {
              std::cout << " [FAIL] : to_flat_index incorrect" << std::endl;
              throw std::runtime_error("pd_test_multiindex_to_flat_index failed");
          }

.. _example-categoricalindex-to_list-35:

.. dropdown:: to_list (pd_test_1_all.cpp:10247)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 10237
      :emphasize-lines: 11

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_extension_index_to_list() {
          std::cout << "========= to_list =========================";

          pandas::CategoricalArray arr({"x", "y", "z"});
          pandas::CategoricalIndex idx(arr);

          auto list = idx.to_list();

          bool passed = (list.size() == 3 &&
                         list[0].has_value() && *list[0] == "x" &&
                         list[1].has_value() && *list[1] == "y" &&
                         list[2].has_value() && *list[2] == "z");
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_extension_index_to_list() : to_list check failed" << std::endl;
              throw std::runtime_error("pd_test_extension_index_to_list failed");
          }

.. _example-categoricalindex-to_numpy-36:

.. dropdown:: to_numpy (pd_test_1_all.cpp:16764)
   :class-title: example-dropdown

   .. 
code-block:: cpp :linenos: :lineno-start: 16754 :emphasize-lines: 11 // ===================================================================== // to_numpy Tests // ===================================================================== void pd_test_ndframe_to_numpy() { std::cout << "========= to_numpy =============================================" << std::endl; pandas::Series s({10, 20, 30}); auto arr = s.to_numpy(); bool passed = arr.getSize() == 3; if (!passed) { std::cout << " [FAIL] : in pd_test_ndframe_to_numpy() : size" << std::endl; throw std::runtime_error("pd_test_ndframe_to_numpy failed: size"); } passed = arr.getElementAt({0}) == 10 && arr.getElementAt({1}) == 20 && arr.getElementAt({2}) == 30; if (!passed) { std::cout << " [FAIL] : in pd_test_ndframe_to_numpy() : values" << std::endl; .. _example-categoricalindex-to_string-37: .. dropdown:: to_string (pd_test_1_all.cpp:2693) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 2683 :emphasize-lines: 11 pandas::PeriodArray arr_m(std::vector{ "2020-01", "NaT", "2025-06" }, "M"); // Year auto years = arr_m.year(); auto y0 = years[0]; if (!y0.has_value() || y0.value() != 2020) { std::cout << " [FAIL] : year[0] should be 2020, got " << (y0.has_value() ? std::to_string(y0.value()) : "NA") << std::endl; throw std::runtime_error("pd_test_period_array_year_month_quarter failed: year[0]"); } auto y1 = years[1]; if (y1.has_value()) { std::cout << " [FAIL] : year[1] should be NA (NaT)" << std::endl; throw std::runtime_error("pd_test_period_array_year_month_quarter failed: year[1] should be NA"); } auto y2 = years[2]; .. _example-categoricalindex-tolist-38: .. dropdown:: tolist (pd_test_3_all.cpp:2300) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 2290 :emphasize-lines: 11 threw = true; } if (!threw) { throw std::runtime_error("swapaxes should throw for invalid axes"); } std::cout << " -> tests passed" << std::endl; } void pd_test_3_all_categorical_to_list() { std::cout << "========= CategoricalArray.to_list()/tolist() ========="; std::vector> values = {"a", "b", std::nullopt, "c"}; pandas::CategoricalArray arr(values); auto list = arr.to_list(); if (list.size() != 4 || *list[0] != "a" || *list[1] != "b" || list[2].has_value() || *list[3] != "c") { throw std::runtime_error("to_list failed"); } .. _example-categoricalindex-astype-39: .. dropdown:: astype (pd_test_1_all.cpp:21292) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 21282 :emphasize-lines: 11 std::cout << "========= astype all columns to float64 ============="; // Create DataFrame with int64 columns std::map> data; data["A"] = {1, 2, 3, 4, 5}; data["B"] = {10, 20, 30, 40, 50}; pandas::DataFrame df(data); // Convert all columns to float64 pandas::DataFrame df_float = df.astype("float64"); // Verify dtype changed pandas::Series dtypes = df_float.dtypes(); bool passed = true; if (dtypes[static_cast(0)] != "float64") { std::cout << " [FAIL] : in pd_test_astype_all_columns_to_float64() : column A dtype is " << dtypes[static_cast(0)] << ", expected float64" << std::endl; passed = false; } if (dtypes[static_cast(1)] != "float64") { .. _example-categoricalindex-copy-40: .. dropdown:: copy (pd_test_1_all.cpp:5798) :class-title: example-dropdown .. 
code-block:: cpp
      :linenos:
      :lineno-start: 5788
      :emphasize-lines: 11

      // ============================================================================
      // Copy/Rename Tests
      // ============================================================================

      void pd_test_categorical_index_copy() {
          std::cout << "========= copy ========================================";

          pandas::CategoricalArray arr({"a", "b", "c"});
          pandas::CategoricalIndex idx(arr, "original");

          pandas::CategoricalIndex copied = idx.copy();

          bool passed = (copied.size() == idx.size() && copied.name() == idx.name() &&
                         copied.categories() == idx.categories() && copied.ordered() == idx.ordered());
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_categorical_index_copy()" << std::endl;
              throw std::runtime_error("pd_test_categorical_index_copy failed");
          }

          std::cout << " -> tests passed" << std::endl;
      }

.. _example-categoricalindex-infer_objects-41:

.. dropdown:: infer_objects (pd_test_1_all.cpp:27595)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 27585
      :emphasize-lines: 11

      // Create DataFrame with string column containing integers
      std::map> data;
      data["A"] = {"1", "2", "3", "4", "5"};
      pandas::DataFrame df(data);

      // Before inference, dtype should be string/object
      std::string before_dtype = df["A"].dtype_name();

      // Apply infer_objects
      pandas::DataFrame result = df.infer_objects();

      // After inference, dtype should be int64
      std::string after_dtype = result["A"].dtype_name();

      bool passed = (after_dtype == "int64");
      if (!passed) {
          std::cout << " [FAIL] : in pd_test_infer_objects_integer_column() : expected int64, got " << after_dtype << std::endl;
          throw std::runtime_error("pd_test_infer_objects_integer_column failed");
      }

.. _example-categoricalindex-view-42:

.. dropdown:: view (pd_test_3_all.cpp:2147)
   :class-title: example-dropdown

   .. 
code-block:: cpp :linenos: :lineno-start: 2137 :emphasize-lines: 11 throw std::runtime_error("memory_usage shallow too small"); } if (deep < shallow) { throw std::runtime_error("memory_usage deep should be >= shallow"); } std::cout << " -> tests passed" << std::endl; } void pd_test_3_all_categorical_ravel_view() { std::cout << "========= CategoricalArray.ravel()/view() ============="; std::vector> values = {"a", "b", "c"}; pandas::CategoricalArray arr(values); auto raveled = arr.ravel(); if (raveled.size() != 3 || !raveled.equals(arr)) { throw std::runtime_error("ravel failed"); } auto viewed = arr.view(); .. _example-categoricalindex-is_-43: .. dropdown:: is_ (pd_test_3_all.cpp:3972) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 3962 :emphasize-lines: 11 // For typed Index, this is a no-op if (result.size() != 5) { throw std::runtime_error("infer_objects size should be 5"); } std::cout << " -> tests passed" << std::endl; } void pd_test_3_all_index_is_() { std::cout << "========= Index.is_() =============================="; pandas::Index idx1({1, 2, 3, 4, 5}); pandas::Index idx2({1, 2, 3, 4, 5}); // Different object // Different objects should not be the same if (idx1.is_(idx2)) { throw std::runtime_error("different objects should not be is_() equal"); } // Same object should be the same .. _example-categoricalindex-is_boolean-44: .. dropdown:: is_boolean (pd_test_3_all.cpp:3290) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 3280 :emphasize-lines: 11 std::cout << " -> tests passed" << std::endl; } void pd_test_3_all_datetime_index_type_checks() { std::cout << "========= DatetimeIndex type checks ======================"; pandas::DatetimeIndex idx = pandas::date_range("2024-01-01", "2024-01-05", std::nullopt, "D"); // Type check methods if (idx.is_boolean()) { throw std::runtime_error("is_boolean() should be false"); } if (idx.is_categorical()) { throw std::runtime_error("is_categorical() should be false"); } if (idx.is_floating()) { throw std::runtime_error("is_floating() should be false"); } if (idx.is_integer()) { throw std::runtime_error("is_integer() should be false"); .. _example-categoricalindex-is_categorical-45: .. dropdown:: is_categorical (pd_test_3_all.cpp:3293) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 3283 :emphasize-lines: 11 void pd_test_3_all_datetime_index_type_checks() { std::cout << "========= DatetimeIndex type checks ======================"; pandas::DatetimeIndex idx = pandas::date_range("2024-01-01", "2024-01-05", std::nullopt, "D"); // Type check methods if (idx.is_boolean()) { throw std::runtime_error("is_boolean() should be false"); } if (idx.is_categorical()) { throw std::runtime_error("is_categorical() should be false"); } if (idx.is_floating()) { throw std::runtime_error("is_floating() should be false"); } if (idx.is_integer()) { throw std::runtime_error("is_integer() should be false"); } if (idx.is_interval()) { throw std::runtime_error("is_interval() should be false"); .. _example-categoricalindex-is_floating-46: .. dropdown:: is_floating (pd_test_3_all.cpp:622) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 612 :emphasize-lines: 11 // Test with integer index pandas::IndexDtype int_dtype; if (!int_dtype.is_numeric()) { std::cout << " [FAIL] : in pd_test_3_all_index_dtype_checks() : int should be numeric" << std::endl; throw std::runtime_error("pd_test_3_all_index_dtype_checks failed: int is_numeric"); } if (!int_dtype.is_integer()) { std::cout << " [FAIL] : in pd_test_3_all_index_dtype_checks() : int should be integer" << std::endl; throw std::runtime_error("pd_test_3_all_index_dtype_checks failed: int is_integer"); } if (int_dtype.is_floating()) { std::cout << " [FAIL] : in pd_test_3_all_index_dtype_checks() : int should not be floating" << std::endl; throw std::runtime_error("pd_test_3_all_index_dtype_checks failed: int is_floating"); } if (int_dtype.is_object()) { std::cout << " [FAIL] : in pd_test_3_all_index_dtype_checks() : int should not be object" << std::endl; throw std::runtime_error("pd_test_3_all_index_dtype_checks failed: int is_object"); } // Test with float index pandas::IndexDtype float_dtype; .. _example-categoricalindex-is_integer-47: .. dropdown:: is_integer (pd_test_3_all.cpp:618) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 608 :emphasize-lines: 11 void pd_test_3_all_index_dtype_checks() { std::cout << "========= IndexDtype.is_numeric/integer/floating/object() "; // Test with integer index pandas::IndexDtype int_dtype; if (!int_dtype.is_numeric()) { std::cout << " [FAIL] : in pd_test_3_all_index_dtype_checks() : int should be numeric" << std::endl; throw std::runtime_error("pd_test_3_all_index_dtype_checks failed: int is_numeric"); } if (!int_dtype.is_integer()) { std::cout << " [FAIL] : in pd_test_3_all_index_dtype_checks() : int should be integer" << std::endl; throw std::runtime_error("pd_test_3_all_index_dtype_checks failed: int is_integer"); } if (int_dtype.is_floating()) { std::cout << " [FAIL] : in pd_test_3_all_index_dtype_checks() : int should not be floating" << std::endl; throw std::runtime_error("pd_test_3_all_index_dtype_checks failed: int is_floating"); } if (int_dtype.is_object()) { std::cout << " [FAIL] : in pd_test_3_all_index_dtype_checks() : int should not be object" << std::endl; throw std::runtime_error("pd_test_3_all_index_dtype_checks failed: int is_object"); .. _example-categoricalindex-is_interval-48: .. dropdown:: is_interval (pd_test_3_all.cpp:3302) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 3292 :emphasize-lines: 11 } if (idx.is_categorical()) { throw std::runtime_error("is_categorical() should be false"); } if (idx.is_floating()) { throw std::runtime_error("is_floating() should be false"); } if (idx.is_integer()) { throw std::runtime_error("is_integer() should be false"); } if (idx.is_interval()) { throw std::runtime_error("is_interval() should be false"); } if (idx.is_numeric()) { throw std::runtime_error("is_numeric() should be false"); } if (idx.is_object()) { throw std::runtime_error("is_object() should be false"); } if (idx.holds_integer()) { throw std::runtime_error("holds_integer() should be false"); .. _example-categoricalindex-is_numeric-49: .. 
dropdown:: is_numeric (pd_test_3_all.cpp:614) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 604 :emphasize-lines: 11 // ============================================================================ // Category 4: Index Type Checking (IndexDtype) // ============================================================================ void pd_test_3_all_index_dtype_checks() { std::cout << "========= IndexDtype.is_numeric/integer/floating/object() "; // Test with integer index pandas::IndexDtype int_dtype; if (!int_dtype.is_numeric()) { std::cout << " [FAIL] : in pd_test_3_all_index_dtype_checks() : int should be numeric" << std::endl; throw std::runtime_error("pd_test_3_all_index_dtype_checks failed: int is_numeric"); } if (!int_dtype.is_integer()) { std::cout << " [FAIL] : in pd_test_3_all_index_dtype_checks() : int should be integer" << std::endl; throw std::runtime_error("pd_test_3_all_index_dtype_checks failed: int is_integer"); } if (int_dtype.is_floating()) { std::cout << " [FAIL] : in pd_test_3_all_index_dtype_checks() : int should not be floating" << std::endl; throw std::runtime_error("pd_test_3_all_index_dtype_checks failed: int is_floating"); .. _example-categoricalindex-is_object-50: .. dropdown:: is_object (pd_test_3_all.cpp:626) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 616 :emphasize-lines: 11 throw std::runtime_error("pd_test_3_all_index_dtype_checks failed: int is_numeric"); } if (!int_dtype.is_integer()) { std::cout << " [FAIL] : in pd_test_3_all_index_dtype_checks() : int should be integer" << std::endl; throw std::runtime_error("pd_test_3_all_index_dtype_checks failed: int is_integer"); } if (int_dtype.is_floating()) { std::cout << " [FAIL] : in pd_test_3_all_index_dtype_checks() : int should not be floating" << std::endl; throw std::runtime_error("pd_test_3_all_index_dtype_checks failed: int is_floating"); } if (int_dtype.is_object()) { std::cout << " [FAIL] : in pd_test_3_all_index_dtype_checks() : int should not be object" << std::endl; throw std::runtime_error("pd_test_3_all_index_dtype_checks failed: int is_object"); } // Test with float index pandas::IndexDtype float_dtype; if (!float_dtype.is_numeric()) { std::cout << " [FAIL] : in pd_test_3_all_index_dtype_checks() : float should be numeric" << std::endl; throw std::runtime_error("pd_test_3_all_index_dtype_checks failed: float is_numeric"); } .. _example-categoricalindex-all-51: .. dropdown:: all (pd_test_1_all.cpp:247) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 237 :emphasize-lines: 11 pandas::BooleanArray has_true({ std::optional(false), std::optional(true) }); any_result = has_true.any(); if (!any_result.has_value() || !any_result.value()) { std::cout << " [FAIL] : in pd_test_boolean_array_reductions() : any() with True" << std::endl; throw std::runtime_error("pd_test_boolean_array_reductions failed: any() with True"); } // Test all() pandas::BooleanArray all_true({ std::optional(true), std::optional(true) }); auto all_result = all_true.all(); if (!all_result.has_value() || !all_result.value()) { std::cout << " [FAIL] : in pd_test_boolean_array_reductions() : all() of all True" << std::endl; throw std::runtime_error("pd_test_boolean_array_reductions failed: all() all True"); } .. 
_example-categoricalindex-any-52: .. dropdown:: any (pd_test_1_all.cpp:226) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 216 :emphasize-lines: 11 std::cout << " [FAIL] : in pd_test_boolean_array_kleene_not() : ~NA should be NA" << std::endl; throw std::runtime_error("pd_test_boolean_array_kleene_not failed: ~NA"); } std::cout << " -> tests passed" << std::endl; } void pd_test_boolean_array_reductions() { std::cout << "========= BooleanArray: reductions ======================= "; // Test any() pandas::BooleanArray all_false({ std::optional(false), std::optional(false) }); auto any_result = all_false.any(); if (!any_result.has_value() || any_result.value()) { std::cout << " [FAIL] : in pd_test_boolean_array_reductions() : any() of all False" << std::endl; throw std::runtime_error("pd_test_boolean_array_reductions failed: any() all False"); } .. _example-categoricalindex-argmax-53: .. dropdown:: argmax (pd_test_1_all.cpp:1323) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 1313 :emphasize-lines: 11 } // argmin auto min_idx = arr.argmin(); if (!min_idx.has_value() || min_idx.value() != 2) { std::cout << " [FAIL] : argmin should be 2 (2023-01-01)" << std::endl; throw std::runtime_error("pd_test_datetime_array_sorting failed: argmin"); } // argmax auto max_idx = arr.argmax(); if (!max_idx.has_value() || max_idx.value() != 3) { std::cout << " [FAIL] : argmax should be 3 (2023-12-31)" << std::endl; throw std::runtime_error("pd_test_datetime_array_sorting failed: argmax"); } std::cout << " -> tests passed" << std::endl; } void pd_test_datetime_array_unique() { std::cout << "========= DatetimeArray: unique/factorize ======================= "; .. _example-categoricalindex-argmin-54: .. dropdown:: argmin (pd_test_1_all.cpp:1316) :class-title: example-dropdown .. 
code-block:: cpp
      :linenos:
      :lineno-start: 1306
      :emphasize-lines: 11

          if (indices.getElementAt({0}) != 2) {
              std::cout << " [FAIL] : argsort: first should be index 2 (2023-01-01)" << std::endl;
              throw std::runtime_error("pd_test_datetime_array_sorting failed: argsort first");
          }
          if (indices.getElementAt({3}) != 1) {
              std::cout << " [FAIL] : argsort: last should be index 1 (NaT)" << std::endl;
              throw std::runtime_error("pd_test_datetime_array_sorting failed: NaT position");
          }

          // argmin
          auto min_idx = arr.argmin();
          if (!min_idx.has_value() || min_idx.value() != 2) {
              std::cout << " [FAIL] : argmin should be 2 (2023-01-01)" << std::endl;
              throw std::runtime_error("pd_test_datetime_array_sorting failed: argmin");
          }

          // argmax
          auto max_idx = arr.argmax();
          if (!max_idx.has_value() || max_idx.value() != 3) {
              std::cout << " [FAIL] : argmax should be 3 (2023-12-31)" << std::endl;
              throw std::runtime_error("pd_test_datetime_array_sorting failed: argmax");

.. _example-categoricalindex-as_ordered-55:

.. dropdown:: as_ordered (pd_test_1_all.cpp:791)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 781
      :emphasize-lines: 11

              unordered.min();
          } catch (const std::exception&) {
              threw = true;
          }
          if (!threw) {
              std::cout << " [FAIL] : in pd_test_categorical_array_ordered_operations() : unordered min should throw" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_ordered_operations failed: unordered min should throw");
          }

          // Test as_ordered / as_unordered
          pandas::CategoricalArray reordered = unordered.as_ordered();
          if (!reordered.ordered()) {
              std::cout << " [FAIL] : in pd_test_categorical_array_ordered_operations() : as_ordered failed" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_ordered_operations failed: as_ordered failed");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_categorical_array_comparisons() {
          std::cout << "========= CategoricalArray: comparisons ======================= ";

.. _example-categoricalindex-as_unordered-56:

.. dropdown:: as_unordered (pd_test_1_all.cpp:778)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 768
      :emphasize-lines: 11

          }

          // Test max
          std::optional max_val = arr.max();
          if (!max_val.has_value() || *max_val != "high") {
              std::cout << " [FAIL] : in pd_test_categorical_array_ordered_operations() : max != 'high'" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_ordered_operations failed: max != 'high'");
          }

          // Test unordered throws for min/max
          pandas::CategoricalArray unordered = arr.as_unordered();
          bool threw = false;
          try {
              unordered.min();
          } catch (const std::exception&) {
              threw = true;
          }
          if (!threw) {
              std::cout << " [FAIL] : in pd_test_categorical_array_ordered_operations() : unordered min should throw" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_ordered_operations failed: unordered min should throw");
          }

.. _example-categoricalindex-categories-57:

.. dropdown:: categories (pd_test_1_all.cpp:389)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 379
      :emphasize-lines: 11

          std::vector> vals = {
              std::optional("low"),
              std::optional("high"),
              std::optional("medium")
          };
          pandas::CategoricalArray arr3(vals, cats, true); // ordered
          if (!arr3.ordered()) {
              std::cout << " [FAIL] : in pd_test_categorical_array_constructors() : should be ordered" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_constructors failed: should be ordered");
          }
          if (arr3.categories().size() != 3) {
              std::cout << " [FAIL] : in pd_test_categorical_array_constructors() : categories size != 3" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_constructors failed: categories size != 3");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_categorical_array_from_codes() {
          std::cout << "========= CategoricalArray: from_codes ======================= ";

.. _example-categoricalindex-clone-58:

.. dropdown:: clone (pd_test_1_all.cpp:5776)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 5766
      :emphasize-lines: 11

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_categorical_index_clone() {
          std::cout << "========= clone =======================================";

          pandas::CategoricalArray arr({"p", "q", "r"});
          pandas::CategoricalIndex idx(arr, "original");

          std::unique_ptr cloned = idx.clone();

          bool passed = (cloned != nullptr &&
                         cloned->size() == idx.size() &&
                         cloned->name() == idx.name());
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_categorical_index_clone()" << std::endl;
              throw std::runtime_error("pd_test_categorical_index_clone failed");
          }
          std::cout << " -> tests passed" << std::endl;
      }

.. _example-categoricalindex-codes-59:

.. dropdown:: codes (pd_test_1_all.cpp:473)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 463
      :emphasize-lines: 11

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_categorical_array_codes_property() {
          std::cout << "========= CategoricalArray: codes property ======================= ";

          std::vector cats = {"x", "y", "z"};
          std::vector codes = {0, 1, 2, 1, 0};
          pandas::CategoricalArray arr = pandas::CategoricalArray::from_codes(codes, cats);

          numpy::NDArray arr_codes = arr.codes();
          if (arr_codes.getSize() != 5) {
              std::cout << " [FAIL] : in pd_test_categorical_array_codes_property() : codes size != 5" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_codes_property failed: codes size != 5");
          }

          // Check codes match
          for (size_t i = 0; i < codes.size(); ++i) {
              if (arr_codes.getElementAt({i}) != codes[i]) {
                  std::cout << " [FAIL] : in pd_test_categorical_array_codes_property() : code mismatch at " << i << std::endl;
                  throw std::runtime_error("pd_test_categorical_array_codes_property failed: code mismatch");

.. _example-categoricalindex-format-60:

.. dropdown:: format (main.cpp:20)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 10
      :emphasize-lines: 11

      int main() {
          // Automatically log all output to temp/pd_test_output.log
          numpy::TestLogger logger("temp/pd_test_output.log");

          int res = 0;
          int res1 = 0;
          std::string resS = "";

          // call all the tests
          res1 = dataframe_tests::pd_test_main();
          resS += std::format(" pd_test_main: {} errors\n", res1);
          res += res1;

          std::cout << "\n------------------------- main --------------------------------------------\n";
          std::cout << std::endl << "All tests completed. Nb errors = " << res << std::endl;
          std::cout << "Details: \n" << resS;
          std::cout << "\n---------------------------------------------------------------------------\n";

          return res;
      }

.. _example-categoricalindex-has_category-61:

.. dropdown:: has_category (pd_test_1_all.cpp:5303)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 5293
      :emphasize-lines: 11

      }

      void pd_test_categorical_index_values_with_categories_constructor() {
          std::cout << "========= values with categories constructor ==========";

          std::vector> values = {"a", "b", "a"};
          std::vector categories = {"a", "b", "c", "d"};
          pandas::CategoricalIndex idx(values, categories, true, "explicit_cats");

          bool passed = (idx.size() == 3 && idx.num_categories() == 4 && idx.ordered() &&
                         idx.has_category("c") && idx.has_category("d"));
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_categorical_index_values_with_categories_constructor()" << std::endl;
              throw std::runtime_error("pd_test_categorical_index_values_with_categories_constructor failed");
          }
          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_categorical_index_copy_constructor() {
          std::cout << "========= copy constructor ============================";

.. _example-categoricalindex-holds_integer-62:

.. dropdown:: holds_integer (pd_test_3_all.cpp:3311)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 3301
      :emphasize-lines: 11

          }
          if (idx.is_interval()) {
              throw std::runtime_error("is_interval() should be false");
          }
          if (idx.is_numeric()) {
              throw std::runtime_error("is_numeric() should be false");
          }
          if (idx.is_object()) {
              throw std::runtime_error("is_object() should be false");
          }
          if (idx.holds_integer()) {
              throw std::runtime_error("holds_integer() should be false");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_3_all_datetime_index_sort() {
          std::cout << "========= DatetimeIndex.sort_values() ====================";

          pandas::DatetimeIndex idx = pandas::date_range("2024-01-01", "2024-01-05", std::nullopt, "D");

.. _example-categoricalindex-identical-63:

.. dropdown:: identical (pd_test_1_all.cpp:5883)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 5873
      :emphasize-lines: 11

      }

      void pd_test_categorical_index_identical() {
          std::cout << "========= identical ===================================";

          pandas::CategoricalArray arr({"a", "b"});
          pandas::CategoricalIndex idx1(arr, "same_name");
          pandas::CategoricalIndex idx2(arr, "same_name");
          pandas::CategoricalIndex idx3(arr, "diff_name");

          bool passed = (idx1.identical(idx2) && !idx1.identical(idx3));
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_categorical_index_identical()" << std::endl;
              throw std::runtime_error("pd_test_categorical_index_identical failed");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      // ============================================================================
      // Inherited Operations Tests

.. _example-categoricalindex-inferred_type-64:

.. dropdown:: inferred_type (pd_test_1_all.cpp:5270)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 5260
      :emphasize-lines: 11

      }

      void pd_test_categorical_index_array_constructor() {
          std::cout << "========= array constructor ===========================";

          pandas::CategoricalArray arr({"apple", "banana", "apple", "cherry"});
          pandas::CategoricalIndex idx(arr, "fruits");

          bool passed = (idx.size() == 4 && !idx.empty() &&
                         idx.name().has_value() && *idx.name() == "fruits" &&
                         idx.inferred_type() == "categorical");
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_categorical_index_array_constructor()" << std::endl;
              throw std::runtime_error("pd_test_categorical_index_array_constructor failed");
          }
          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_categorical_index_values_constructor() {
          std::cout << "========= values constructor ==========================";

.. _example-categoricalindex-item-65:

.. dropdown:: item (pd_test_3_all.cpp:3712)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 3702
      :emphasize-lines: 11

          // Test is_interval (always false for base Index)
          if (int_idx.is_interval()) {
              throw std::runtime_error("base Index should not be interval");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_3_all_index_item() {
          std::cout << "========= Index.item() =============================";

          pandas::Index idx1({42});
          numpy::int64 val = idx1.item();
          if (val != 42) {
              throw std::runtime_error("item() should return 42");
          }

          // Test error for size != 1
          pandas::Index idx2({1, 2, 3});

.. _example-categoricalindex-memory_usage-66:

.. dropdown:: memory_usage (pd_test_1_all.cpp:27063)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 27053
      :emphasize-lines: 11

          }
          std::cout << "====================================== [OK] pd_test_value_counts test suite ========================== " << std::endl;
          return 0;
      }
      } // namespace dataframe_tests
      // ------------------- pd_test_value_counts.cpp (end) -----------------------------

      // ------------------- pd_test_memory_usage.cpp (start) -----------------------------
      // Tests for DataFrame.memory_usage() - pandas-compatible memory usage reporting
      namespace dataframe_tests {
      namespace dataframe_tests_memory_usage {

      void pd_test_memory_usage_basic() {
          std::cout << "========= basic memory_usage =======================";

          // Create a simple DataFrame with multiple columns
          std::map> data;
          data["A"] = {1.0, 2.0, 3.0, 4.0, 5.0};

.. _example-categoricalindex-num_categories-67:

.. dropdown:: num_categories (pd_test_1_all.cpp:5285)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 5275
      :emphasize-lines: 11

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_categorical_index_values_constructor() {
          std::cout << "========= values constructor ==========================";

          std::vector> values = {"a", "b", "a", std::nullopt, "c"};
          pandas::CategoricalIndex idx(values, std::optional("letters"), false);

          bool passed = (idx.size() == 5 &&
                         idx.num_categories() == 3 &&
                         !idx.ordered() &&
                         idx.name().has_value() && *idx.name() == "letters");
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_categorical_index_values_constructor()" << std::endl;
              throw std::runtime_error("pd_test_categorical_index_values_constructor failed");
          }
          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_categorical_index_values_with_categories_constructor() {

.. _example-categoricalindex-ordered-68:

.. dropdown:: ordered (pd_test_1_all.cpp:359)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 349
      :emphasize-lines: 11

      void pd_test_categorical_array_constructors() {
          std::cout << "========= CategoricalArray: constructors ======================= ";

          // Default constructor
          pandas::CategoricalArray arr1;
          if (arr1.size() != 0) {
              std::cout << " [FAIL] : in pd_test_categorical_array_constructors() : default constructor size != 0" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_constructors failed: default constructor size != 0");
          }
          if (arr1.ordered()) {
              std::cout << " [FAIL] : in pd_test_categorical_array_constructors() : default should be unordered" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_constructors failed: default should be unordered");
          }

          // Constructor from values (infer categories)
          std::vector> values = {
              std::optional("a"),
              std::optional("b"),
              std::optional("a"),
              std::optional("c")

.. _example-categoricalindex-ordered-69:

.. dropdown:: ordered (pd_test_1_all.cpp:359)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 349
      :emphasize-lines: 11

      void pd_test_categorical_array_constructors() {
          std::cout << "========= CategoricalArray: constructors ======================= ";

          // Default constructor
          pandas::CategoricalArray arr1;
          if (arr1.size() != 0) {
              std::cout << " [FAIL] : in pd_test_categorical_array_constructors() : default constructor size != 0" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_constructors failed: default constructor size != 0");
          }
          if (arr1.ordered()) {
              std::cout << " [FAIL] : in pd_test_categorical_array_constructors() : default should be unordered" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_constructors failed: default should be unordered");
          }

          // Constructor from values (infer categories)
          std::vector> values = {
              std::optional("a"),
              std::optional("b"),
              std::optional("a"),
              std::optional("c")

.. _example-categoricalindex-putmask-70:

.. dropdown:: putmask (pd_test_3_all.cpp:3752)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 3742
      :emphasize-lines: 11

          // Should be at least sizeof index + 5 * sizeof(int64)
          if (usage < 5 * sizeof(numpy::int64)) {
              throw std::runtime_error("memory_usage too small");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_3_all_index_putmask() {
          std::cout << "========= Index.putmask() ==========================";

          pandas::Index idx({1, 2, 3, 4, 5});
          numpy::NDArray mask(std::vector{5});
          mask.setElementAt({0}, numpy::bool_(true));
          mask.setElementAt({1}, numpy::bool_(false));
          mask.setElementAt({2}, numpy::bool_(true));
          mask.setElementAt({3}, numpy::bool_(false));
          mask.setElementAt({4}, numpy::bool_(true));

          auto result = idx.putmask(mask, numpy::int64(99));

.. _example-categoricalindex-ravel-71:

.. dropdown:: ravel (pd_test_3_all.cpp:2147)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 2137
      :emphasize-lines: 11

              throw std::runtime_error("memory_usage shallow too small");
          }
          if (deep < shallow) {
              throw std::runtime_error("memory_usage deep should be >= shallow");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_3_all_categorical_ravel_view() {
          std::cout << "========= CategoricalArray.ravel()/view() =============";

          std::vector> values = {"a", "b", "c"};
          pandas::CategoricalArray arr(values);

          auto raveled = arr.ravel();
          if (raveled.size() != 3 || !raveled.equals(arr)) {
              throw std::runtime_error("ravel failed");
          }

          auto viewed = arr.view();

.. _example-categoricalindex-remove_categories-72:

.. dropdown:: remove_categories (pd_test_1_all.cpp:591)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 581
      :emphasize-lines: 11

      }

      void pd_test_categorical_array_remove_categories() {
          std::cout << "========= CategoricalArray: remove_categories ======================= ";

          std::vector cats = {"a", "b", "c"};
          std::vector codes = {0, 1, 2, 1}; // a, b, c, b
          pandas::CategoricalArray arr = pandas::CategoricalArray::from_codes(codes, cats);

          // Remove 'c' - values with 'c' become NA
          pandas::CategoricalArray result = arr.remove_categories({"c"});
          if (result.categories().size() != 2) {
              std::cout << " [FAIL] : in pd_test_categorical_array_remove_categories() : categories size != 2" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_remove_categories failed: categories size != 2");
          }

          // Element at index 2 should now be NA (was 'c')
          if (!result.is_na(2)) {
              std::cout << " [FAIL] : in pd_test_categorical_array_remove_categories() : removed category should be NA" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_remove_categories failed: removed category should be NA");

.. _example-categoricalindex-remove_unused_categories-73:

.. dropdown:: remove_unused_categories (pd_test_1_all.cpp:737)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 727
      :emphasize-lines: 11

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_categorical_array_remove_unused_categories() {
          std::cout << "========= CategoricalArray: remove_unused_categories ======================= ";

          std::vector cats = {"a", "b", "c", "d"};
          std::vector codes = {0, 0, 2}; // a, a, c (b and d unused)
          pandas::CategoricalArray arr = pandas::CategoricalArray::from_codes(codes, cats);

          pandas::CategoricalArray result = arr.remove_unused_categories();

          // Only 'a' and 'c' should remain
          if (result.categories().size() != 2) {
              std::cout << " [FAIL] : in pd_test_categorical_array_remove_unused_categories() : categories size != 2" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_remove_unused_categories failed: categories size != 2");
          }

          // Values should be preserved
          std::optional val0 = result[0];
          std::optional val2 = result[2];

.. _example-categoricalindex-reorder_categories-74:

.. dropdown:: reorder_categories (pd_test_1_all.cpp:695)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 685
      :emphasize-lines: 11

      void pd_test_categorical_array_reorder_categories() {
          std::cout << "========= CategoricalArray: reorder_categories ======================= ";

          std::vector cats = {"a", "b", "c"};
          std::vector codes = {0, 1, 2}; // a, b, c
          pandas::CategoricalArray arr = pandas::CategoricalArray::from_codes(codes, cats);

          // Reorder categories
          std::vector new_order = {"c", "b", "a"};
          pandas::CategoricalArray result = arr.reorder_categories(new_order);

          // Check categories are reordered
          const std::vector& result_cats = result.categories();
          if (result_cats[0] != "c" || result_cats[1] != "b" || result_cats[2] != "a") {
              std::cout << " [FAIL] : in pd_test_categorical_array_reorder_categories() : categories not reordered" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_reorder_categories failed: categories not reordered");
          }

          // Values should be preserved
          std::optional val0 = result[0];

.. _example-categoricalindex-repeat-75:

.. dropdown:: repeat (pd_test_3_all.cpp:2166)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 2156
      :emphasize-lines: 11

          auto viewed = arr.view();
          if (viewed.size() != 3 || !viewed.equals(arr)) {
              throw std::runtime_error("view failed");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_3_all_categorical_repeat() {
          std::cout << "========= CategoricalArray.repeat() ===================";

          std::vector> values = {"a", "b"};
          pandas::CategoricalArray arr(values);

          auto result = arr.repeat(3);
          if (result.size() != 6 || *result[0] != "a" || *result[2] != "a" ||
              *result[3] != "b" || *result[5] != "b") {
              throw std::runtime_error("repeat scalar failed");
          }

.. _example-categoricalindex-repr-76:

.. dropdown:: repr (pd_test_1_all.cpp:10906)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 10896
      :emphasize-lines: 11

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_extension_index_repr() {
          std::cout << "========= repr =========================";

          pandas::CategoricalArray arr({"a", "b", "c"});
          // Use ExtensionIndex directly to test base class repr
          pandas::ExtensionIndex idx(arr, "test");

          std::string repr_str = idx.repr();

          bool passed = (!repr_str.empty() &&
                         repr_str.find("ExtensionIndex") != std::string::npos);
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_extension_index_repr() : repr check failed" << std::endl;
              throw std::runtime_error("pd_test_extension_index_repr failed");
          }

          std::cout << " -> tests passed" << std::endl;
      }

.. _example-categoricalindex-round-77:

.. dropdown:: round (pd_test_1_all.cpp:1688)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 1678
      :emphasize-lines: 11

      void pd_test_floating_array_rounding() {
          std::cout << "========= FloatingArray: rounding ======================= ";

          pandas::FloatingArray arr({
              std::optional(1.234),
              std::optional(2.567),
              std::nullopt
          });

          auto rounded = arr.round(2);
          if (std::abs(rounded[0].value() - 1.23) > 0.001 ||
              std::abs(rounded[1].value() - 2.57) > 0.001) {
              std::cout << " [FAIL] : in pd_test_floating_array_rounding() : round(2)" << std::endl;
              throw std::runtime_error("pd_test_floating_array_rounding failed: round(2)");
          }

          if (!rounded.is_na(2)) {
              std::cout << " [FAIL] : in pd_test_floating_array_rounding() : round should preserve NA" << std::endl;
              throw std::runtime_error("pd_test_floating_array_rounding failed: NA preservation");
          }

.. _example-categoricalindex-set_categories-78:

.. dropdown:: set_categories (pd_test_1_all.cpp:623)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 613
      :emphasize-lines: 11

      void pd_test_categorical_array_set_categories() {
          std::cout << "========= CategoricalArray: set_categories ======================= ";

          std::vector cats = {"a", "b"};
          std::vector codes = {0, 1, 0}; // a, b, a
          pandas::CategoricalArray arr = pandas::CategoricalArray::from_codes(codes, cats);

          // Set new categories (values not in new categories become NA)
          std::vector new_cats = {"a", "c"}; // 'b' removed, 'c' added
          pandas::CategoricalArray result = arr.set_categories(new_cats);

          if (result.categories().size() != 2) {
              std::cout << " [FAIL] : in pd_test_categorical_array_set_categories() : categories size != 2" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_set_categories failed: categories size != 2");
          }

          // Element at index 1 should be NA (was 'b', now not in categories)
          if (!result.is_na(1)) {
              std::cout << " [FAIL] : in pd_test_categorical_array_set_categories() : 'b' value should be NA" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_set_categories failed: 'b' value should be NA");

.. _example-categoricalindex-slice_indexer-79:

.. dropdown:: slice_indexer (pd_test_3_all.cpp:711)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 701
      :emphasize-lines: 11

          }

          std::cout << " -> tests passed" << std::endl;
      }

      // ============================================================================
      // Category 6: Index Indexer Methods
      // ============================================================================

      void pd_test_3_all_index_indexers() {
          std::cout << "========= Index.get_indexer_for/non_unique/slice_indexer() ";

          std::vector vals = {"a", "b", "c", "d", "e"};
          pandas::Index idx(vals);

          // Test get_indexer_for()
          std::vector target = {"b", "d", "f"}; // "f" doesn't exist
          numpy::NDArray indexer = idx.get_indexer_for(target);
          if (indexer.getSize() != 3) {
              std::cout << " [FAIL] : in pd_test_3_all_index_indexers() : get_indexer_for size mismatch" << std::endl;
              throw std::runtime_error("pd_test_3_all_index_indexers failed: get_indexer_for size");

.. _example-categoricalindex-slice_locs-80:

.. dropdown:: slice_locs (pd_test_1_all.cpp:18275)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 18265
      :emphasize-lines: 11

          }

          std::cout << "-> tests passed" << std::endl;
      }

      void pd_test_range_index_slice_locs() {
          std::cout << "========= slice_locs ================================== ";

          pandas::RangeIndex ri(0, 10); // [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

          auto [start_idx, stop_idx] = ri.slice_locs(3, 7);

          bool passed = (start_idx == 3 && stop_idx == 8);

          if (!passed) {
              std::cout << " [FAIL] : slice_locs" << std::endl;
              throw std::runtime_error("pd_test_range_index_slice_locs failed");
          }

          std::cout << "-> tests passed" << std::endl;
      }

.. _example-categoricalindex-sort-81:

.. dropdown:: sort (pd_test_3_all.cpp:3869)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 3859
      :emphasize-lines: 11

              throw std::runtime_error("last 2 positions should be NaN");
          }
          if (std::abs(result[0] - 3.0) > 0.001) {
              throw std::runtime_error("shift(-2) [0] should be 3.0");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_3_all_index_sort() {
          std::cout << "========= Index.sort() =============================";

          pandas::Index idx({3, 1, 4, 1, 5, 9, 2, 6});

          auto result = idx.sort();
          if (result[0] != 1 || result[1] != 1 || result[7] != 9) {
              throw std::runtime_error("sort() not working correctly");
          }

          // Test descending
          result = idx.sort(false);

.. _example-categoricalindex-sortlevel-82:

.. dropdown:: sortlevel (pd_test_1_all.cpp:14676)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 14666
      :emphasize-lines: 11

      void pd_test_multiindex_sortlevel() {
          std::cout << "========= sortlevel =================================== ";

          std::vector> arrays = {
              {"b", "a", "c"},
              {"2", "1", "3"}
          };
          pandas::MultiIndex mi = pandas::MultiIndex::from_arrays(arrays);

          auto [sorted, indices] = mi.sortlevel(0);

          bool passed = true;

          // After sorting by level 0: a, b, c
          if (sorted[0][0] != "a" || sorted[1][0] != "b" || sorted[2][0] != "c") {
              std::cout << " [FAIL] : not sorted correctly by level 0" << std::endl;
              passed = false;
          }

          if (!passed) {

.. _example-categoricalindex-type_id-83:

.. dropdown:: type_id (pd_test_3_all.cpp:25592)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 25582
      :emphasize-lines: 11

      // ------------------- pd_test_value_classify (end) ------------------

      // ------------------- pd_test_index_type_id (start) ------------------

      namespace dataframe_tests_index_type_id {

      void pd_test_index_type_id_dispatch() {
          std::cout << "========= IndexTypeId dispatch =======================";

          // RangeIndex
          ::pandas::RangeIndex ri(0, 5);
          if (ri.type_id() != ::pandas::IndexTypeId::RangeIndex)
              throw std::runtime_error("RangeIndex type_id failed");

          // Index
          ::pandas::Index si(std::vector{"a", "b", "c"});
          if (si.type_id() != ::pandas::IndexTypeId::IndexString)
              throw std::runtime_error("Index type_id failed");

          // Index
          ::pandas::Index ii(std::vector{1, 2, 3});
          if (ii.type_id() != ::pandas::IndexTypeId::IndexInt64)