CategoricalArray
================

.. cpp:class:: pandas::CategoricalArray

   Extension array type for specialized data storage.

Example
-------

.. code-block:: cpp

   #include
   using namespace pandas;

   // Use CategoricalArray
   CategoricalArray obj;
   // ... operations ...

Constructors
------------

.. list-table::
   :widths: 55 25 20
   :header-rows: 1

   * - Signature
     - Location
     - Example
   * - ``CategoricalArray(const numpy::NDArray& codes, const std::vector& categories, bool ordered = false, bool copy = true)``
     - pd_categorical_array.h:103
     - :ref:`View `
   * - ``CategoricalArray(const std::vector>& values, const std::optional>& categories = std::nullopt, std::optional ordered = std::nullopt, const std::optional& dtype = std::nullopt, bool fastpath = false, bool copy = true)``
     - pd_categorical_array.h:126
     - :ref:`View `
   * - ``explicit CategoricalArray(const std::vector>& values, bool ordered)``
     - pd_categorical_array.h:233
     - :ref:`View `
   * - ``CategoricalArray(const std::vector>& values, const std::vector& categories, bool ordered = false)``
     - pd_categorical_array.h:251
     - :ref:`View `

Construction
------------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``static CategoricalArray from_codes(const std::vector& codes, const std::vector& categories, bool ordered = false, const std::string& dtype = "", bool validate = true)``
     - static CategoricalArray
     - pd_categorical_array.h:275
     - :ref:`View `
   * - ``static CategoricalArray from_sequence(const std::vector>& values, bool ordered = false)``
     - static CategoricalArray
     - pd_categorical_array.h:304
     -

Indexing / Selection
--------------------

..
.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``numpy::int32 get_code(ssize_t index) const``
     - numpy::int32
     - pd_categorical_array.h:519
     -
   * - ``CategoricalArray take(const std::vector& indices, std::optional axis = std::nullopt, bool allow_fill = false, const std::string& fill_value = "") const``
     - CategoricalArray
     - pd_categorical_array.h:1305
     - :ref:`View `

Data Manipulation
-----------------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``CategoricalArray dropna() const``
     - CategoricalArray
     - pd_categorical_array.h:595
     - :ref:`View `
   * - ``CategoricalArray insert(size_t loc, const std::optional& item) const``
     - CategoricalArray
     - pd_categorical_array.h:1594
     - :ref:`View `
   * - ``CategoricalArray rename_categories(const std::vector& new_categories) const``
     - CategoricalArray
     - pd_categorical_array.h:740
     - :ref:`View `
   * - ``CategoricalArray rename_categories(const std::unordered_map& mapping) const``
     - CategoricalArray
     - pd_categorical_array.h:760
     - :ref:`View `

Missing Data
------------

..
.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``CategoricalArray fillna(const std::string& value, const std::string& method = "", std::optional limit = std::nullopt, bool copy = true) const``
     - CategoricalArray
     - pd_categorical_array.h:566
     - :ref:`View `
   * - ``CategoricalArray interpolate(const std::string& method = "pad", int axis = 0, const std::optional>& index = std::nullopt, std::optional limit = std::nullopt, const std::string& limit_direction = "forward", const std::optional& limit_area = std::nullopt, bool copy = true) const``
     - CategoricalArray
     - pd_categorical_array.h:1643
     - :ref:`View `
   * - ``numpy::NDArray isna() const``
     - numpy::NDArray
     - pd_categorical_array.h:539
     - :ref:`View `
   * - ``numpy::NDArray isnull() const``
     - numpy::NDArray
     - pd_categorical_array.h:1798
     - :ref:`View `
   * - ``numpy::NDArray notna() const``
     - numpy::NDArray
     - pd_categorical_array.h:550
     - :ref:`View `
   * - ``numpy::NDArray notnull() const``
     - numpy::NDArray
     - pd_categorical_array.h:1884
     - :ref:`View `

Statistics
----------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``size_t count() const``
     - size_t
     - pd_categorical_array.h:609
     - :ref:`View `
   * - ``std::map describe() const``
     - std::map
     - pd_categorical_array.h:1480
     - :ref:`View `
   * - ``std::optional max(bool skipna = true) const``
     - std::optional
     - pd_categorical_array.h:918
     - :ref:`View `
   * - ``std::optional min(bool skipna = true) const``
     - std::optional
     - pd_categorical_array.h:884
     - :ref:`View `
   * - ``std::pair, std::vector> value_counts() const``
     - std::pair, std::vector>
     - pd_categorical_array.h:999
     - :ref:`View `

Aggregation
-----------

..
.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``CategoricalArray map(const std::unordered_map& mapper, const std::string& na_action = "") const``
     - CategoricalArray
     - pd_categorical_array.h:1808
     - :ref:`View `
   * - ``CategoricalArray map(Func func, const std::string& na_action = "") const``
     - CategoricalArray
     - pd_categorical_array.h:1852
     - :ref:`View `

Arithmetic
----------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``CategoricalArray add_categories(const std::vector& new_categories) const``
     - CategoricalArray
     - pd_categorical_array.h:638
     - :ref:`View `

Comparison
----------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``bool equals(const CategoricalArray& other) const``
     - bool
     - pd_categorical_array.h:1575
     - :ref:`View `
   * - ``size_t len() const``
     - size_t
     - pd_categorical_array.h:447
     - :ref:`View `

Sorting
-------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``numpy::NDArray argsort(bool ascending = true, const std::string& na_position = "last", const std::string& kind = "quicksort") const``
     - numpy::NDArray
     - pd_categorical_array.h:1180
     - :ref:`View `
   * - ``size_t searchsorted(const std::string& value, const std::string& side = "left", std::optional> sorter = std::nullopt) const``
     - size_t
     - pd_categorical_array.h:1970
     - :ref:`View `
   * - ``CategoricalArray sort_values(bool ascending = true, const std::string& na_position = "last", bool inplace = false) const``
     - CategoricalArray
     - pd_categorical_array.h:2067
     - :ref:`View `

Reshaping
---------

..
.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``CategoricalArray T() const``
     - CategoricalArray
     - pd_categorical_array.h:2162
     - :ref:`View `
   * - ``CategoricalArray swapaxes(int axis1, int axis2) const``
     - CategoricalArray
     - pd_categorical_array.h:2094
     - :ref:`View `
   * - ``CategoricalArray transpose() const``
     - CategoricalArray
     - pd_categorical_array.h:2155
     - :ref:`View `

Combining
---------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``static CategoricalArray concat(const std::vector& arrays)``
     - static CategoricalArray
     - pd_categorical_array.h:314
     - :ref:`View `
   * - ``static CategoricalArray concat_merge(const std::vector& arrays)``
     - static CategoricalArray
     - pd_categorical_array.h:351
     - :ref:`View `

Time Series
-----------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``CategoricalArray shift(int64_t periods = 1, const std::optional& fill_value = std::nullopt) const``
     - CategoricalArray
     - pd_categorical_array.h:2031
     - :ref:`View `

I/O
---

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``std::vector> to_list() const``
     - std::vector>
     - pd_categorical_array.h:2105
     - :ref:`View `
   * - ``numpy::NDArray to_numpy(bool copy = true, U na_value = U{-1}) const``
     - numpy::NDArray
     - pd_categorical_array.h:2136
     - :ref:`View `
   * - ``std::string to_string() const``
     - std::string
     - pd_categorical_array.h:2212
     - :ref:`View `
   * - ``std::vector> tolist() const``
     - std::vector>
     - pd_categorical_array.h:2124
     - :ref:`View `

Conversion
----------

..
.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``std::vector> astype(const std::string& dtype, bool copy = true) const``
     - std::vector>
     - pd_categorical_array.h:1365
     - :ref:`View `
   * - ``numpy::NDArray astype_codes() const``
     - numpy::NDArray
     - pd_categorical_array.h:1418
     - :ref:`View `
   * - ``CategoricalArray copy() const``
     - CategoricalArray
     - pd_categorical_array.h:1298
     - :ref:`View `
   * - ``CategoricalArray view() const``
     - CategoricalArray
     - pd_categorical_array.h:2174
     - :ref:`View `

Set Operations
--------------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``numpy::NDArray duplicated(const std::string& keep = "first") const``
     - numpy::NDArray
     - pd_categorical_array.h:1517
     - :ref:`View `
   * - ``numpy::NDArray isin(const std::vector& values) const``
     - numpy::NDArray
     - pd_categorical_array.h:1770
     - :ref:`View `
   * - ``CategoricalArray unique() const``
     - CategoricalArray
     - pd_categorical_array.h:951
     - :ref:`View `

Type Checking
-------------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``bool is_na(ssize_t index) const``
     - bool
     - pd_categorical_array.h:527
     - :ref:`View `

Other Methods
-------------

..
.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``std::optional argmax(std::optional axis = std::nullopt, bool skipna = true) const``
     - std::optional
     - pd_categorical_array.h:1258
     - :ref:`View `
   * - ``std::optional argmin(std::optional axis = std::nullopt, bool skipna = true) const``
     - std::optional
     - pd_categorical_array.h:1220
     - :ref:`View `
   * - ``CategoricalArray as_ordered() const``
     - CategoricalArray
     - pd_categorical_array.h:865
     - :ref:`View `
   * - ``CategoricalArray as_unordered() const``
     - CategoricalArray
     - pd_categorical_array.h:872
     - :ref:`View `
   * - ``std::unordered_map build_category_map() const``
     - std::unordered_map
     - pd_categorical_array.h:59
     -
   * - ``const std::vector& categories() const``
     - const std::vector&
     - pd_categorical_array.h:473
     - :ref:`View `
   * - ``const std::string& categories_dtype_str() const``
     - const std::string&
     - pd_categorical_array.h:492
     -
   * - ``void check_for_ordered(const std::string& op) const``
     - void
     - pd_categorical_array.h:1427
     - :ref:`View `
   * - ``check_for_ordered("searchsorted")``
     -
     - pd_categorical_array.h:1974
     - :ref:`View `
   * - ``check_for_ordered("sort_values")``
     -
     - pd_categorical_array.h:2071
     - :ref:`View `
   * - ``const numpy::NDArray& codes() const``
     - const numpy::NDArray&
     - pd_categorical_array.h:454
     - :ref:`View `
   * - ``std::vector codes_vector() const``
     - std::vector
     - pd_categorical_array.h:461
     -
   * - ``CategoricalArray delete_(size_t loc, std::optional axis = std::nullopt) const``
     - CategoricalArray
     - pd_categorical_array.h:1438
     - :ref:`View `
   * - ``CategoricalArray delete_(const std::vector& locs) const``
     - CategoricalArray
     - pd_categorical_array.h:1461
     - :ref:`View `
   * - ``CategoricalDtype dtype() const``
     - CategoricalDtype
     - pd_categorical_array.h:401
     - :ref:`View `
   * - ``bool empty() const``
     - bool
     - pd_categorical_array.h:440
     - :ref:`View `
   * - ``std::pair, CategoricalArray> factorize() const``
     - std::pair, CategoricalArray>
     - pd_categorical_array.h:969
     - :ref:`View `
   * - ``bool has_na() const``
     - bool
     - pd_categorical_array.h:622
     - :ref:`View `
   * - ``std::vector internal_get_values() const``
     - std::vector
     - pd_categorical_array.h:2198
     -
   * - ``size_t memory_usage(bool deep = false) const``
     - size_t
     - pd_categorical_array.h:1865
     - :ref:`View `
   * - ``const std::optional& name() const``
     - const std::optional&
     - pd_categorical_array.h:484
     - :ref:`View `
   * - ``size_t nbytes() const``
     - size_t
     - pd_categorical_array.h:415
     - :ref:`View `
   * - ``constexpr int ndim() const``
     - constexpr int
     - pd_categorical_array.h:426
     - :ref:`View `
   * - ``bool ordered() const``
     - bool
     - pd_categorical_array.h:480
     - :ref:`View `
   * - ``CategoricalArray ravel() const``
     - CategoricalArray
     - pd_categorical_array.h:1892
     - :ref:`View `
   * - ``CategoricalArray remove_categories(const std::vector& removals) const``
     - CategoricalArray
     - pd_categorical_array.h:657
     - :ref:`View `
   * - ``CategoricalArray remove_unused_categories() const``
     - CategoricalArray
     - pd_categorical_array.h:823
     - :ref:`View `
   * - ``CategoricalArray reorder_categories(const std::vector& new_categories, std::optional ordered = std::nullopt) const``
     - CategoricalArray
     - pd_categorical_array.h:783
     - :ref:`View `
   * - ``CategoricalArray repeat(size_t repeats, std::optional axis = std::nullopt) const``
     - CategoricalArray
     - pd_categorical_array.h:1901
     - :ref:`View `
   * - ``CategoricalArray repeat(const std::vector& repeats) const``
     - CategoricalArray
     - pd_categorical_array.h:1921
     - :ref:`View `
   * - ``std::string repr() const``
     - std::string
     - pd_categorical_array.h:2238
     - :ref:`View `
   * - ``CategoricalArray reshape(const std::vector& new_shape) const``
     - CategoricalArray
     - pd_categorical_array.h:1950
     - :ref:`View `
   * - ``CategoricalArray set_categories(const std::vector& new_categories, std::optional ordered = std::nullopt, bool rename = false) const``
     - CategoricalArray
     - pd_categorical_array.h:700
     - :ref:`View `
   * - ``void set_categories_dtype(const std::string& dtype)``
     - void
     - pd_categorical_array.h:496
     -
   * - ``void set_name(const std::string& name)``
     - void
     - pd_categorical_array.h:488
     - :ref:`View `
   * - ``CategoricalArray set_ordered(bool value) const``
     - CategoricalArray
     - pd_categorical_array.h:2021
     - :ref:`View `
   * - ``std::vector shape() const``
     - std::vector
     - pd_categorical_array.h:433
     - :ref:`View `
   * - ``size_t size() const``
     - size_t
     - pd_categorical_array.h:408
     - :ref:`View `
   * - ``CategoricalArray slice(size_t start, size_t stop, size_t step = 1) const``
     - CategoricalArray
     - pd_categorical_array.h:1342
     - :ref:`View `
   * - ``void validate_codes() const``
     - void
     - pd_categorical_array.h:70
     -

Internal Methods
----------------

*1 internal method (prefixed with underscore)*

Code Examples
-------------

The following examples are extracted from the test suite.

.. _example-categoricalarray-categoricalarray-0:

.. dropdown:: CategoricalArray (pd_test_3_all.cpp:28514)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 28504
      :emphasize-lines: 11

      static int cgo_check(bool cond, const char* msg) {
          if (!cond) {
              std::cout << " FAIL: " << msg << std::endl;
              return 1;
          }
          return 0;
      }

      static pandas::CategoricalArray make_abc() {
          std::vector> v{ std::string("a"), std::string("b"), std::string("c"), std::string("a") };
          return pandas::CategoricalArray(v, false);
      }

      void pd_test_cat_rename_dict() {
          std::cout << " -- pd_test_cat_rename_dict --" << std::endl;
          int fail = 0;
          auto arr = make_abc();
          std::unordered_map m{{"a", "A"}, {"b", "B"}};
          auto r = arr.rename_categories(m);
          const auto& cats = r.categories();
          fail += cgo_check(cats.size() == 3, "size==3");

.. _example-categoricalarray-categoricalarray-1:

.. dropdown:: CategoricalArray (pd_test_3_all.cpp:28514)
   :class-title: example-dropdown

   ..
   .. code-block:: cpp
      :linenos:
      :lineno-start: 28504
      :emphasize-lines: 11

      static int cgo_check(bool cond, const char* msg) {
          if (!cond) {
              std::cout << " FAIL: " << msg << std::endl;
              return 1;
          }
          return 0;
      }

      static pandas::CategoricalArray make_abc() {
          std::vector> v{ std::string("a"), std::string("b"), std::string("c"), std::string("a") };
          return pandas::CategoricalArray(v, false);
      }

      void pd_test_cat_rename_dict() {
          std::cout << " -- pd_test_cat_rename_dict --" << std::endl;
          int fail = 0;
          auto arr = make_abc();
          std::unordered_map m{{"a", "A"}, {"b", "B"}};
          auto r = arr.rename_categories(m);
          const auto& cats = r.categories();
          fail += cgo_check(cats.size() == 3, "size==3");

.. _example-categoricalarray-categoricalarray-2:

.. dropdown:: CategoricalArray (pd_test_3_all.cpp:28514)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 28504
      :emphasize-lines: 11

      static int cgo_check(bool cond, const char* msg) {
          if (!cond) {
              std::cout << " FAIL: " << msg << std::endl;
              return 1;
          }
          return 0;
      }

      static pandas::CategoricalArray make_abc() {
          std::vector> v{ std::string("a"), std::string("b"), std::string("c"), std::string("a") };
          return pandas::CategoricalArray(v, false);
      }

      void pd_test_cat_rename_dict() {
          std::cout << " -- pd_test_cat_rename_dict --" << std::endl;
          int fail = 0;
          auto arr = make_abc();
          std::unordered_map m{{"a", "A"}, {"b", "B"}};
          auto r = arr.rename_categories(m);
          const auto& cats = r.categories();
          fail += cgo_check(cats.size() == 3, "size==3");

.. _example-categoricalarray-categoricalarray-3:

.. dropdown:: CategoricalArray (pd_test_3_all.cpp:28514)
   :class-title: example-dropdown

   ..
   .. code-block:: cpp
      :linenos:
      :lineno-start: 28504
      :emphasize-lines: 11

      static int cgo_check(bool cond, const char* msg) {
          if (!cond) {
              std::cout << " FAIL: " << msg << std::endl;
              return 1;
          }
          return 0;
      }

      static pandas::CategoricalArray make_abc() {
          std::vector> v{ std::string("a"), std::string("b"), std::string("c"), std::string("a") };
          return pandas::CategoricalArray(v, false);
      }

      void pd_test_cat_rename_dict() {
          std::cout << " -- pd_test_cat_rename_dict --" << std::endl;
          int fail = 0;
          auto arr = make_abc();
          std::unordered_map m{{"a", "A"}, {"b", "B"}};
          auto r = arr.rename_categories(m);
          const auto& cats = r.categories();
          fail += cgo_check(cats.size() == 3, "size==3");

.. _example-categoricalarray-from_codes-4:

.. dropdown:: from_codes (pd_test_1_all.cpp:403)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 393
      :emphasize-lines: 11

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_categorical_array_from_codes() {
          std::cout << "========= CategoricalArray: from_codes ======================= ";

          std::vector cats = {"a", "b", "c"};
          std::vector codes = {0, 1, 2, 0, 1, -1};  // -1 is NA

          pandas::CategoricalArray arr = pandas::CategoricalArray::from_codes(codes, cats, false);

          if (arr.size() != 6) {
              std::cout << " [FAIL] : in pd_test_categorical_array_from_codes() : size != 6" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_from_codes failed: size != 6");
          }

          // Check that code=-1 creates NA
          if (!arr.is_na(5)) {
              std::cout << " [FAIL] : in pd_test_categorical_array_from_codes() : code -1 should be NA" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_from_codes failed: code -1 should be NA");

.. _example-categoricalarray-take-5:

.. dropdown:: take (pd_test_1_all.cpp:5903)
   :class-title: example-dropdown

   ..
   .. code-block:: cpp
      :linenos:
      :lineno-start: 5893
      :emphasize-lines: 11

      // Inherited Operations Tests
      // ============================================================================

      void pd_test_categorical_index_take() {
          std::cout << "========= inherited take ==============================";

          pandas::CategoricalArray arr({"a", "b", "c", "d"});
          pandas::CategoricalIndex idx(arr);

          std::vector indices = {0, 2, 3};
          pandas::ExtensionIndex taken = idx.take(indices);

          bool passed = (taken.size() == 3);

          if (!passed) {
              std::cout << " [FAIL] : in pd_test_categorical_index_take()" << std::endl;
              throw std::runtime_error("pd_test_categorical_index_take failed");
          }
          std::cout << " -> tests passed" << std::endl;
      }

.. _example-categoricalarray-dropna-6:

.. dropdown:: dropna (pd_test_1_all.cpp:531)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 521
      :emphasize-lines: 11

          }

          // Test isna array
          numpy::NDArray na_mask = arr.isna();
          if (na_mask.getSize() != 4) {
              std::cout << " [FAIL] : in pd_test_categorical_array_na_handling() : isna size != 4" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_na_handling failed: isna size != 4");
          }

          // Test dropna
          pandas::CategoricalArray dropped = arr.dropna();
          if (dropped.size() != 2) {
              std::cout << " [FAIL] : in pd_test_categorical_array_na_handling() : dropna size != 2" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_na_handling failed: dropna size != 2");
          }

          // Test fillna (fill with existing category)
          pandas::CategoricalArray filled = arr.fillna("a");  // 'a' is in categories
          if (filled.has_na()) {
              std::cout << " [FAIL] : in pd_test_categorical_array_na_handling() : fillna should have no NA" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_na_handling failed: fillna should have no NA");

.. _example-categoricalarray-insert-7:

.. dropdown:: insert (pd_test_1_all.cpp:12028)
   :class-title: example-dropdown

   ..
   .. code-block:: cpp
      :linenos:
      :lineno-start: 12018
      :emphasize-lines: 11

          }
          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_index_insert_delete() {
          std::cout << "========= insert and delete ===========================";

          pandas::Index idx{1, 2, 4, 5};

          auto inserted = idx.insert(2, 3);
          bool passed = (inserted.size() == 5);
          passed = passed && (inserted[2] == 3);

          auto deleted = inserted.delete_(2);
          passed = passed && (deleted.size() == 4);
          passed = passed && deleted.equals(idx);

          if (!passed) {
              std::cout << " [FAIL] : in pd_test_index_insert_delete() : insert/delete failed" << std::endl;
              throw std::runtime_error("pd_test_index_insert_delete failed");

.. _example-categoricalarray-rename_categories-8:

.. dropdown:: rename_categories (pd_test_1_all.cpp:655)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 645
      :emphasize-lines: 11

      void pd_test_categorical_array_rename_categories() {
          std::cout << "========= CategoricalArray: rename_categories ======================= ";

          std::vector cats = {"a", "b"};
          std::vector codes = {0, 1, 0};  // a, b, a

          pandas::CategoricalArray arr = pandas::CategoricalArray::from_codes(codes, cats);

          // Rename categories
          std::vector new_names = {"alpha", "beta"};
          pandas::CategoricalArray result = arr.rename_categories(new_names);

          // Check categories are renamed
          const std::vector& result_cats = result.categories();
          if (result_cats[0] != "alpha" || result_cats[1] != "beta") {
              std::cout << " [FAIL] : in pd_test_categorical_array_rename_categories() : categories not renamed" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_rename_categories failed: categories not renamed");
          }

          // Values should now be renamed
          std::optional val = result[0];

.. _example-categoricalarray-rename_categories-9:

.. dropdown:: rename_categories (pd_test_1_all.cpp:655)
   :class-title: example-dropdown

   ..
   .. code-block:: cpp
      :linenos:
      :lineno-start: 645
      :emphasize-lines: 11

      void pd_test_categorical_array_rename_categories() {
          std::cout << "========= CategoricalArray: rename_categories ======================= ";

          std::vector cats = {"a", "b"};
          std::vector codes = {0, 1, 0};  // a, b, a

          pandas::CategoricalArray arr = pandas::CategoricalArray::from_codes(codes, cats);

          // Rename categories
          std::vector new_names = {"alpha", "beta"};
          pandas::CategoricalArray result = arr.rename_categories(new_names);

          // Check categories are renamed
          const std::vector& result_cats = result.categories();
          if (result_cats[0] != "alpha" || result_cats[1] != "beta") {
              std::cout << " [FAIL] : in pd_test_categorical_array_rename_categories() : categories not renamed" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_rename_categories failed: categories not renamed");
          }

          // Values should now be renamed
          std::optional val = result[0];

.. _example-categoricalarray-fillna-10:

.. dropdown:: fillna (pd_test_1_all.cpp:537)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 527
      :emphasize-lines: 11

              throw std::runtime_error("pd_test_categorical_array_na_handling failed: isna size != 4");
          }

          // Test dropna
          pandas::CategoricalArray dropped = arr.dropna();
          if (dropped.size() != 2) {
              std::cout << " [FAIL] : in pd_test_categorical_array_na_handling() : dropna size != 2" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_na_handling failed: dropna size != 2");
          }

          // Test fillna (fill with existing category)
          pandas::CategoricalArray filled = arr.fillna("a");  // 'a' is in categories
          if (filled.has_na()) {
              std::cout << " [FAIL] : in pd_test_categorical_array_na_handling() : fillna should have no NA" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_na_handling failed: fillna should have no NA");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_categorical_array_add_categories() {

.. _example-categoricalarray-interpolate-11:

..
.. dropdown:: interpolate (pd_test_1_all.cpp:24365)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 24355
      :emphasize-lines: 11

          std::cout << "====================================== [OK] pd_test_idxmax_idxmin test suite ========================== " << std::endl;
          return 0;
      }

      } // namespace dataframe_tests
      // ------------------- pd_test_idxmax_idxmin.cpp (end) -----------------------------

      // ------------------- pd_test_interpolate.cpp (start) -----------------------------
      // dataframe_tests/pd_test_interpolate.cpp
      // Test file for DataFrame.interpolate() method

      #include
      #include
      #include
      #include
      #include
      #include "../pandas/pd_dataframe.h"

      // CRITICAL: No using namespace directives

.. _example-categoricalarray-isna-12:

.. dropdown:: isna (pd_test_1_all.cpp:524)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 514
      :emphasize-lines: 11

              throw std::runtime_error("pd_test_categorical_array_na_handling failed: has_na() should be true");
          }

          // Test count (non-NA)
          if (arr.count() != 2) {
              std::cout << " [FAIL] : in pd_test_categorical_array_na_handling() : count() != 2" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_na_handling failed: count() != 2");
          }

          // Test isna array
          numpy::NDArray na_mask = arr.isna();
          if (na_mask.getSize() != 4) {
              std::cout << " [FAIL] : in pd_test_categorical_array_na_handling() : isna size != 4" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_na_handling failed: isna size != 4");
          }

          // Test dropna
          pandas::CategoricalArray dropped = arr.dropna();
          if (dropped.size() != 2) {
              std::cout << " [FAIL] : in pd_test_categorical_array_na_handling() : dropna size != 2" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_na_handling failed: dropna size != 2");

.. _example-categoricalarray-isnull-13:

.. dropdown:: isnull (pd_test_3_all.cpp:671)
   :class-title: example-dropdown

   ..
   .. code-block:: cpp
      :linenos:
      :lineno-start: 661
      :emphasize-lines: 11

      // Category 5: Index Null Detection
      // ============================================================================

      void pd_test_3_all_index_null_detection() {
          std::cout << "========= Index.isnull/notnull() =====================";

          // Test with float index (can have NaN)
          std::vector vals = {1.0, std::nan(""), 3.0, std::nan("")};
          pandas::Index idx(vals);

          numpy::NDArray isnull_result = idx.isnull();
          if (isnull_result.getSize() != 4) {
              std::cout << " [FAIL] : in pd_test_3_all_index_null_detection() : isnull() size mismatch" << std::endl;
              throw std::runtime_error("pd_test_3_all_index_null_detection failed: isnull() size");
          }

          // Index 0: 1.0 -> not null
          if (isnull_result.getElementAt({0})) {
              std::cout << " [FAIL] : in pd_test_3_all_index_null_detection() : index 0 should not be null" << std::endl;
              throw std::runtime_error("pd_test_3_all_index_null_detection failed: index 0");
          }

          // Index 1: NaN -> null

.. _example-categoricalarray-notna-14:

.. dropdown:: notna (pd_test_1_all.cpp:6595)
   :class-title: example-dropdown

   ..
   .. code-block:: cpp
      :linenos:
      :lineno-start: 6585
      :emphasize-lines: 11

              if (!na_mask.getElementAt({2, 1})) {
                  std::cout << " [FAIL] : in pd_test_dataframe_manipulation() : isna at (2,1) should be true" << std::endl;
                  throw std::runtime_error("pd_test_dataframe_manipulation failed: isna at (2,1)");
              }

              // Row 0, col 0 should NOT be NA
              if (na_mask.getElementAt({0, 0})) {
                  std::cout << " [FAIL] : in pd_test_dataframe_manipulation() : isna at (0,0) should be false" << std::endl;
                  throw std::runtime_error("pd_test_dataframe_manipulation failed: isna at (0,0)");
              }
              auto notna_mask = df_na.notna();
              if (notna_mask.getElementAt({1, 0})) {
                  std::cout << " [FAIL] : in pd_test_dataframe_manipulation() : notna at (1,0) should be false" << std::endl;
                  throw std::runtime_error("pd_test_dataframe_manipulation failed: notna at (1,0)");
              }
          }

          // Test fillna
          {
              std::map> float_data;
              float_data["X"] = {1.0, std::nan(""), 3.0};

.. _example-categoricalarray-notnull-15:

.. dropdown:: notnull (pd_test_3_all.cpp:665)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 655
      :emphasize-lines: 11

          }
          std::cout << " -> tests passed" << std::endl;
      }

      // ============================================================================
      // Category 5: Index Null Detection
      // ============================================================================

      void pd_test_3_all_index_null_detection() {
          std::cout << "========= Index.isnull/notnull() =====================";

          // Test with float index (can have NaN)
          std::vector vals = {1.0, std::nan(""), 3.0, std::nan("")};
          pandas::Index idx(vals);

          numpy::NDArray isnull_result = idx.isnull();
          if (isnull_result.getSize() != 4) {
              std::cout << " [FAIL] : in pd_test_3_all_index_null_detection() : isnull() size mismatch" << std::endl;
              throw std::runtime_error("pd_test_3_all_index_null_detection failed: isnull() size");
          }

.. _example-categoricalarray-count-16:

.. dropdown:: count (pd_test_1_all.cpp:66)
   :class-title: example-dropdown

   ..
   .. code-block:: cpp
      :linenos:
      :lineno-start: 56
      :emphasize-lines: 11

          if (arr.is_na(0)) {
              std::cout << " [FAIL] : in pd_test_boolean_array_na_handling() : is_na(0) should be false" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_na_handling failed: is_na(0) should be false");
          }

          if (!arr.has_na()) {
              std::cout << " [FAIL] : in pd_test_boolean_array_na_handling() : has_na() should be true" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_na_handling failed: has_na() should be true");
          }

          if (arr.count() != 2) {
              std::cout << " [FAIL] : in pd_test_boolean_array_na_handling() : count() should be 2" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_na_handling failed: count() should be 2");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_boolean_array_kleene_and() {
          std::cout << "========= BooleanArray: Kleene AND ======================= ";

.. _example-categoricalarray-describe-17:

.. dropdown:: describe (pd_test_2_all.cpp:19793)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 19783
      :emphasize-lines: 11

              ++g_fail;
          }
      }

      static bool approx_eq(double a, double b, double tol = 1e-9) {
          if (std::isnan(a) && std::isnan(b)) return true;
          return std::abs(a - b) < tol;
      }

      // =====================================================================
      // Test: describe() default mode - numeric columns only
      // =====================================================================
      void pd_test_describe_numeric_only() {
          std::cout << " -- pd_test_describe_numeric_only --" << std::endl;

          pandas::DataFrame df;
          df.add_column("A", std::vector{1.0, 2.0, 3.0, 4.0, 5.0});
          df.add_column("B", std::vector{10.0, 20.0, 30.0, 40.0, 50.0});
          df.add_column("Name", std::vector{"a", "b", "c", "d", "e"});

.. _example-categoricalarray-max-18:

.. dropdown:: max (pd_test_1_all.cpp:771)
   :class-title: example-dropdown

   ..
   .. code-block:: cpp
      :linenos:
      :lineno-start: 761
      :emphasize-lines: 11

          pandas::CategoricalArray arr = pandas::CategoricalArray::from_codes(codes, cats, true);  // ordered

          // Test min
          std::optional min_val = arr.min();
          if (!min_val.has_value() || *min_val != "low") {
              std::cout << " [FAIL] : in pd_test_categorical_array_ordered_operations() : min != 'low'" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_ordered_operations failed: min != 'low'");
          }

          // Test max
          std::optional max_val = arr.max();
          if (!max_val.has_value() || *max_val != "high") {
              std::cout << " [FAIL] : in pd_test_categorical_array_ordered_operations() : max != 'high'" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_ordered_operations failed: max != 'high'");
          }

          // Test unordered throws for min/max
          pandas::CategoricalArray unordered = arr.as_unordered();
          bool threw = false;
          try {
              unordered.min();

.. _example-categoricalarray-min-19:

.. dropdown:: min (pd_test_1_all.cpp:764)
   :class-title: example-dropdown

   ..
code-block:: cpp :linenos: :lineno-start: 754 :emphasize-lines: 11 } void pd_test_categorical_array_ordered_operations() { std::cout << "========= CategoricalArray: ordered operations (min/max) ======================= "; std::vector cats = {"low", "medium", "high"}; std::vector codes = {0, 2, 1, 0, -1}; // low, high, medium, low, NA pandas::CategoricalArray arr = pandas::CategoricalArray::from_codes(codes, cats, true); // ordered // Test min std::optional min_val = arr.min(); if (!min_val.has_value() || *min_val != "low") { std::cout << " [FAIL] : in pd_test_categorical_array_ordered_operations() : min != 'low'" << std::endl; throw std::runtime_error("pd_test_categorical_array_ordered_operations failed: min != 'low'"); } // Test max std::optional max_val = arr.max(); if (!max_val.has_value() || *max_val != "high") { std::cout << " [FAIL] : in pd_test_categorical_array_ordered_operations() : max != 'high'" << std::endl; throw std::runtime_error("pd_test_categorical_array_ordered_operations failed: max != 'high'"); .. _example-categoricalarray-value_counts-20: .. dropdown:: value_counts (pd_test_1_all.cpp:865) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 855 :emphasize-lines: 11 std::vector> values = { std::optional("a"), std::optional("b"), std::optional("a"), std::optional("a"), std::optional("b"), std::nullopt // NA not counted }; pandas::CategoricalArray arr(values); auto [cats, counts] = arr.value_counts(); // Should have 2 categories if (cats.size() != 2 || counts.size() != 2) { std::cout << " [FAIL] : in pd_test_categorical_array_value_counts() : wrong size" << std::endl; throw std::runtime_error("pd_test_categorical_array_value_counts failed: wrong size"); } // Find 'a' count int64_t a_count = 0, b_count = 0; for (size_t i = 0; i < cats.size(); ++i) { .. _example-categoricalarray-map-21: .. _example-categoricalarray-map-22: .. dropdown:: map (pd_test_1_all.cpp:5839) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 5829 :emphasize-lines: 11 // Map Tests // ============================================================================ void pd_test_categorical_index_map() { std::cout << "========= map ========================================="; pandas::CategoricalArray arr({"yes", "no", "yes"}); pandas::CategoricalIndex idx(arr); std::unordered_map mapping = {{"yes", "1"}, {"no", "0"}}; pandas::CategoricalIndex mapped = idx.map(mapping); bool passed = (mapped.has_category("1") && mapped.has_category("0") && !mapped.has_category("yes") && !mapped.has_category("no")); if (!passed) { std::cout << " [FAIL] : in pd_test_categorical_index_map()" << std::endl; throw std::runtime_error("pd_test_categorical_index_map failed"); } std::cout << " -> tests passed" << std::endl; } .. _example-categoricalarray-add_categories-23: .. dropdown:: add_categories (pd_test_1_all.cpp:555) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 545 :emphasize-lines: 11 } void pd_test_categorical_array_add_categories() { std::cout << "========= CategoricalArray: add_categories ======================= "; std::vector cats = {"a", "b"}; std::vector codes = {0, 1, 0}; pandas::CategoricalArray arr = pandas::CategoricalArray::from_codes(codes, cats); // Add new categories pandas::CategoricalArray result = arr.add_categories({"c", "d"}); if (result.categories().size() != 4) { std::cout << " [FAIL] : in pd_test_categorical_array_add_categories() : new categories size != 4" << std::endl; throw std::runtime_error("pd_test_categorical_array_add_categories failed: new categories size != 4"); } // Original values should be preserved std::optional val = result[0]; if (!val.has_value() || *val != "a") { std::cout << " [FAIL] : in pd_test_categorical_array_add_categories() : value not preserved" << std::endl; throw std::runtime_error("pd_test_categorical_array_add_categories failed: value not preserved"); .. _example-categoricalarray-equals-24: .. dropdown:: equals (pd_test_1_all.cpp:5866) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 5856 :emphasize-lines: 11 std::cout << "========= equals ======================================"; pandas::CategoricalArray arr1({"a", "b", "a"}); pandas::CategoricalArray arr2({"a", "b", "a"}); pandas::CategoricalArray arr3({"a", "b", "c"}); pandas::CategoricalIndex idx1(arr1); pandas::CategoricalIndex idx2(arr2); pandas::CategoricalIndex idx3(arr3); bool passed = (idx1.equals(idx2) && !idx1.equals(idx3)); if (!passed) { std::cout << " [FAIL] : in pd_test_categorical_index_equals()" << std::endl; throw std::runtime_error("pd_test_categorical_index_equals failed"); } std::cout << " -> tests passed" << std::endl; } void pd_test_categorical_index_identical() { std::cout << "========= identical ==================================="; .. _example-categoricalarray-len-25: .. 
dropdown:: len (pd_test_3_all.cpp:20867) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 20857 :emphasize-lines: 11 auto title_result = s.str().title(); if (title_result[0] != "Hello World" || title_result[1] != "Hello World" || title_result[2] != "Hello World") { std::cout << " [FAIL] : title() failed" << std::endl; throw std::runtime_error("pd_test_str_capitalize_title: title() failed"); } std::cout << " -> tests passed" << std::endl; } // ============================================================================ // Test str().len() // ============================================================================ void pd_test_str_len() { std::cout << "========= Series.str().len() ============================"; pandas::Series s({"a", "bb", "ccc", ""}); auto lens = s.str().len(); if (lens[0] != 1 || lens[1] != 2 || lens[2] != 3 || lens[3] != 0) { std::cout << " [FAIL] : len() failed" << std::endl; .. _example-categoricalarray-argsort-26: .. dropdown:: argsort (pd_test_1_all.cpp:1304) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 1294 :emphasize-lines: 11 std::cout << "========= DatetimeArray: sorting ======================= "; pandas::DatetimeArray arr(std::vector{ "2023-06-15", "NaT", "2023-01-01", "2023-12-31" }); // argsort ascending auto indices = arr.argsort(true, "last"); // Expected order: 2023-01-01(2), 2023-06-15(0), 2023-12-31(3), NaT(1) if (indices.getElementAt({0}) != 2) { std::cout << " [FAIL] : argsort: first should be index 2 (2023-01-01)" << std::endl; throw std::runtime_error("pd_test_datetime_array_sorting failed: argsort first"); } if (indices.getElementAt({3}) != 1) { std::cout << " [FAIL] : argsort: last should be index 1 (NaT)" << std::endl; throw std::runtime_error("pd_test_datetime_array_sorting failed: NaT position"); } .. _example-categoricalarray-searchsorted-27: .. dropdown:: searchsorted (pd_test_1_all.cpp:18958) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 18948 :emphasize-lines: 11 // ========================================================================= // Search Tests // ========================================================================= void pd_test_range_index_searchsorted() { std::cout << "========= searchsorted ================================ "; pandas::RangeIndex ri(0, 10, 2); // [0, 2, 4, 6, 8] bool passed = (ri.searchsorted(4, "left") == 2 && ri.searchsorted(4, "right") == 3 && ri.searchsorted(3, "left") == 2 && // 3 would go between 2 and 4 ri.searchsorted(-1, "left") == 0 && // Before all ri.searchsorted(10, "left") == 5); // After all if (!passed) { std::cout << " [FAIL] : searchsorted" << std::endl; throw std::runtime_error("pd_test_range_index_searchsorted failed"); } .. _example-categoricalarray-sort_values-28: .. dropdown:: sort_values (pd_test_1_all.cpp:6408) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 6398 :emphasize-lines: 11 void pd_test_dataframe_sorting() { std::cout << "========= sorting =========================="; std::map> data; data["A"] = {3.0, 1.0, 4.0, 1.0, 5.0}; data["B"] = {9.0, 2.0, 6.0, 5.0, 3.0}; pandas::DataFrame df(data); // Test sort_values ascending auto sorted_asc = df.sort_values("A", true); // First value should be smallest (1.0) std::string first_val = sorted_asc["A"].get_value_str(0); if (std::stod(first_val) != 1.0) { std::cout << " [FAIL] : in pd_test_dataframe_sorting() : sort_values asc first != 1" << std::endl; throw std::runtime_error("pd_test_dataframe_sorting failed: sort_values asc first != 1"); } // Test sort_values descending auto sorted_desc = df.sort_values("A", false); first_val = sorted_desc["A"].get_value_str(0); .. _example-categoricalarray-t-29: .. dropdown:: T (pd_test_1_all.cpp:128) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 118 :emphasize-lines: 11 throw std::runtime_error("pd_test_boolean_array_kleene_and failed: NA & F"); } std::cout << " -> tests passed" << std::endl; } void pd_test_boolean_array_kleene_or() { std::cout << "========= BooleanArray: Kleene OR ======================= "; // Kleene OR truth table: // T | T = T, T | F = T, T | NA = T (True dominates) // F | T = T, F | F = F, F | NA = NA // NA | T = T, NA | F = NA, NA | NA = NA pandas::BooleanArray t({std::optional(true)}); pandas::BooleanArray f({std::optional(false)}); pandas::BooleanArray na({std::nullopt}); // T | NA = T (True dominates) auto tna = (t | na); if (!tna[0].has_value() || !tna[0].value()) { .. _example-categoricalarray-swapaxes-30: .. dropdown:: swapaxes (pd_test_3_all.cpp:2276) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 2266 :emphasize-lines: 11 auto sorted_desc = arr.sort_values(false, "last"); if (*sorted_desc[0] != "c" || *sorted_desc[1] != "b" || *sorted_desc[2] != "a" || sorted_desc[3].has_value()) { throw std::runtime_error("sort_values descending failed"); } std::cout << " -> tests passed" << std::endl; } void pd_test_3_all_categorical_swapaxes() { std::cout << "========= CategoricalArray.swapaxes() ================="; std::vector> values = {"a", "b", "c"}; pandas::CategoricalArray arr(values); auto result = arr.swapaxes(0, 0); if (result.size() != 3) { throw std::runtime_error("swapaxes failed"); } bool threw = false; .. _example-categoricalarray-transpose-31: .. dropdown:: transpose (pd_test_1_all.cpp:16648) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 16638 :emphasize-lines: 11 std::cout << " [FAIL] : in pd_test_ndframe_transpose() : T_() size" << std::endl; throw std::runtime_error("pd_test_ndframe_transpose failed: T_() size"); } passed = transposed[0] == 1 && transposed[1] == 2 && transposed[2] == 3; if (!passed) { std::cout << " [FAIL] : in pd_test_ndframe_transpose() : T_() values" << std::endl; throw std::runtime_error("pd_test_ndframe_transpose failed: T_() values"); } // Test transpose() alias auto transposed2 = s.transpose(); passed = transposed2.size() == s.size(); if (!passed) { std::cout << " [FAIL] : in pd_test_ndframe_transpose() : transpose() size" << std::endl; throw std::runtime_error("pd_test_ndframe_transpose failed: transpose() size"); } std::cout << " -> tests passed" << std::endl; } .. _example-categoricalarray-concat-32: .. dropdown:: concat (pd_test_1_all.cpp:17717) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 17707 :emphasize-lines: 11 } void pd_test_period_index_concat() { std::cout << "========= concat factory =============================="; std::vector ordinals1 = {0, 1}; std::vector ordinals2 = {2, 3}; pandas::PeriodIndex idx1(ordinals1, "D"); pandas::PeriodIndex idx2(ordinals2, "D"); pandas::PeriodIndex concatenated = pandas::PeriodIndex::concat({idx1, idx2}); bool passed = (concatenated.size() == 4); if (!passed) { std::cout << " [FAIL] : in pd_test_period_index_concat()" << std::endl; throw std::runtime_error("pd_test_period_index_concat failed"); } std::cout << " -> tests passed" << std::endl; } .. _example-categoricalarray-concat_merge-33: .. dropdown:: concat_merge (pd_test_3_all.cpp:26636) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 26626 :emphasize-lines: 11 const auto& mi = result.multiindex(); if (mi.get_level_values_str(0)[0] != "x") throw std::runtime_error("Key 'x' not found at level 0"); if (mi.get_level_values_str(0)[2] != "y") throw std::runtime_error("Key 'y' not found at level 0"); std::cout << " -> test passed" << std::endl; } void test_categorical_array_concat_merge() { std::cout << " test_categorical_array_concat_merge" << std::endl; auto cat1 = CategoricalArray::from_codes({0, 1}, {"a", "b"}); auto cat2 = CategoricalArray::from_codes({0, 1}, {"b", "c"}); auto result = CategoricalArray::concat_merge({cat1, cat2}); if (result.categories().size() != 3) throw std::runtime_error("Expected 3 merged categories"); if (result.categories()[0] != "a" || result.categories()[1] != "b" || result.categories()[2] != "c") throw std::runtime_error("Merged categories wrong"); if (result.size() != 4) throw std::runtime_error("Expected 4 elements"); std::cout << " -> test passed" << std::endl; } int pd_test_concat_ext_main() { try { std::cout << "========= concat extension tests =========" << std::endl; .. _example-categoricalarray-shift-34: .. dropdown:: shift (pd_test_1_all.cpp:5188) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 5178 :emphasize-lines: 11 // First element should be NaN val = d["A"].get_value_str(0); passed = std::isnan(std::stod(val)); if (!passed) { std::cout << " [FAIL] : in pd_test_arithmetic_dataframe_diff_shift() : diff NaN failed" << std::endl; throw std::runtime_error("pd_test_arithmetic_dataframe_diff_shift failed: diff NaN failed"); } // shift: [NaN, 1, 3, 6] auto s = df.shift(); val = s["A"].get_value_str(1); passed = std::abs(std::stod(val) - 1.0) < 0.001; if (!passed) { std::cout << " [FAIL] : in pd_test_arithmetic_dataframe_diff_shift() : shift failed" << std::endl; throw std::runtime_error("pd_test_arithmetic_dataframe_diff_shift failed: shift failed"); } std::cout << " -> tests passed" << std::endl; } .. _example-categoricalarray-to_list-35: .. dropdown:: to_list (pd_test_1_all.cpp:10247) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 10237 :emphasize-lines: 11 std::cout << " -> tests passed" << std::endl; } void pd_test_extension_index_to_list() { std::cout << "========= to_list ========================="; pandas::CategoricalArray arr({"x", "y", "z"}); pandas::CategoricalIndex idx(arr); auto list = idx.to_list(); bool passed = (list.size() == 3 && list[0].has_value() && *list[0] == "x" && list[1].has_value() && *list[1] == "y" && list[2].has_value() && *list[2] == "z"); if (!passed) { std::cout << " [FAIL] : in pd_test_extension_index_to_list() : to_list check failed" << std::endl; throw std::runtime_error("pd_test_extension_index_to_list failed"); } .. _example-categoricalarray-to_numpy-36: .. dropdown:: to_numpy (pd_test_1_all.cpp:16764) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 16754 :emphasize-lines: 11 // ===================================================================== // to_numpy Tests // ===================================================================== void pd_test_ndframe_to_numpy() { std::cout << "========= to_numpy =============================================" << std::endl; pandas::Series s({10, 20, 30}); auto arr = s.to_numpy(); bool passed = arr.getSize() == 3; if (!passed) { std::cout << " [FAIL] : in pd_test_ndframe_to_numpy() : size" << std::endl; throw std::runtime_error("pd_test_ndframe_to_numpy failed: size"); } passed = arr.getElementAt({0}) == 10 && arr.getElementAt({1}) == 20 && arr.getElementAt({2}) == 30; if (!passed) { std::cout << " [FAIL] : in pd_test_ndframe_to_numpy() : values" << std::endl; .. _example-categoricalarray-to_string-37: .. dropdown:: to_string (pd_test_1_all.cpp:2693) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 2683 :emphasize-lines: 11 pandas::PeriodArray arr_m(std::vector{ "2020-01", "NaT", "2025-06" }, "M"); // Year auto years = arr_m.year(); auto y0 = years[0]; if (!y0.has_value() || y0.value() != 2020) { std::cout << " [FAIL] : year[0] should be 2020, got " << (y0.has_value() ? std::to_string(y0.value()) : "NA") << std::endl; throw std::runtime_error("pd_test_period_array_year_month_quarter failed: year[0]"); } auto y1 = years[1]; if (y1.has_value()) { std::cout << " [FAIL] : year[1] should be NA (NaT)" << std::endl; throw std::runtime_error("pd_test_period_array_year_month_quarter failed: year[1] should be NA"); } auto y2 = years[2]; .. _example-categoricalarray-tolist-38: .. dropdown:: tolist (pd_test_3_all.cpp:2300) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 2290 :emphasize-lines: 11 threw = true; } if (!threw) { throw std::runtime_error("swapaxes should throw for invalid axes"); } std::cout << " -> tests passed" << std::endl; } void pd_test_3_all_categorical_to_list() { std::cout << "========= CategoricalArray.to_list()/tolist() ========="; std::vector> values = {"a", "b", std::nullopt, "c"}; pandas::CategoricalArray arr(values); auto list = arr.to_list(); if (list.size() != 4 || *list[0] != "a" || *list[1] != "b" || list[2].has_value() || *list[3] != "c") { throw std::runtime_error("to_list failed"); } .. _example-categoricalarray-astype-39: .. dropdown:: astype (pd_test_1_all.cpp:21292) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 21282 :emphasize-lines: 11 std::cout << "========= astype all columns to float64 ============="; // Create DataFrame with int64 columns std::map> data; data["A"] = {1, 2, 3, 4, 5}; data["B"] = {10, 20, 30, 40, 50}; pandas::DataFrame df(data); // Convert all columns to float64 pandas::DataFrame df_float = df.astype("float64"); // Verify dtype changed pandas::Series dtypes = df_float.dtypes(); bool passed = true; if (dtypes[static_cast(0)] != "float64") { std::cout << " [FAIL] : in pd_test_astype_all_columns_to_float64() : column A dtype is " << dtypes[static_cast(0)] << ", expected float64" << std::endl; passed = false; } if (dtypes[static_cast(1)] != "float64") { .. _example-categoricalarray-astype_codes-40: .. dropdown:: astype_codes (pd_test_3_all.cpp:1822) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 1812 :emphasize-lines: 11 std::cout << "========= CategoricalArray.astype() =================="; std::vector> values = {"a", "b", "c", "a", std::nullopt}; pandas::CategoricalArray arr(values); auto str_result = arr.astype("str"); if (str_result.size() != 5 || !str_result[0].has_value() || *str_result[0] != "a" || str_result[4].has_value()) { throw std::runtime_error("astype failed"); } auto codes = arr.astype_codes(); if (codes.getSize() != 5) { throw std::runtime_error("astype_codes failed"); } std::cout << " -> tests passed" << std::endl; } void pd_test_3_all_categorical_check_ordered() { std::cout << "========= CategoricalArray.check_for_ordered() ========"; .. _example-categoricalarray-copy-41: .. dropdown:: copy (pd_test_1_all.cpp:5798) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 5788 :emphasize-lines: 11 // ============================================================================ // Copy/Rename Tests // ============================================================================ void pd_test_categorical_index_copy() { std::cout << "========= copy ========================================"; pandas::CategoricalArray arr({"a", "b", "c"}); pandas::CategoricalIndex idx(arr, "original"); pandas::CategoricalIndex copied = idx.copy(); bool passed = (copied.size() == idx.size() && copied.name() == idx.name() && copied.categories() == idx.categories() && copied.ordered() == idx.ordered()); if (!passed) { std::cout << " [FAIL] : in pd_test_categorical_index_copy()" << std::endl; throw std::runtime_error("pd_test_categorical_index_copy failed"); } std::cout << " -> tests passed" << std::endl; } .. _example-categoricalarray-view-42: .. dropdown:: view (pd_test_3_all.cpp:2147) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 2137 :emphasize-lines: 11 throw std::runtime_error("memory_usage shallow too small"); } if (deep < shallow) { throw std::runtime_error("memory_usage deep should be >= shallow"); } std::cout << " -> tests passed" << std::endl; } void pd_test_3_all_categorical_ravel_view() { std::cout << "========= CategoricalArray.ravel()/view() ============="; std::vector> values = {"a", "b", "c"}; pandas::CategoricalArray arr(values); auto raveled = arr.ravel(); if (raveled.size() != 3 || !raveled.equals(arr)) { throw std::runtime_error("ravel failed"); } auto viewed = arr.view(); .. _example-categoricalarray-duplicated-43: .. dropdown:: duplicated (pd_test_1_all.cpp:10583) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 10573 :emphasize-lines: 11 std::cout << " -> tests passed" << std::endl; } void pd_test_extension_index_duplicated() { std::cout << "========= duplicated ========================="; pandas::CategoricalArray arr({"a", "b", "a", "c", "a"}); pandas::CategoricalIndex idx(arr); auto dup_mask = idx.duplicated("first"); bool passed = (dup_mask.getElementAt({0}) == false && dup_mask.getElementAt({1}) == false && dup_mask.getElementAt({2}) == true && dup_mask.getElementAt({3}) == false && dup_mask.getElementAt({4}) == true); if (!passed) { std::cout << " [FAIL] : in pd_test_extension_index_duplicated() : duplicated check failed" << std::endl; throw std::runtime_error("pd_test_extension_index_duplicated failed"); } .. _example-categoricalarray-isin-44: .. dropdown:: isin (pd_test_1_all.cpp:5938) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 5928 :emphasize-lines: 11 std::cout << " -> tests passed" << std::endl; } void pd_test_categorical_index_isin() { std::cout << "========= inherited isin =============================="; pandas::CategoricalArray arr({"a", "b", "c", "d"}); pandas::CategoricalIndex idx(arr); std::vector values = {"a", "c"}; numpy::NDArray mask = idx.isin(values); bool passed = (mask.getSize() == 4 && mask.getElementAt({0}) == true && // a mask.getElementAt({1}) == false && // b mask.getElementAt({2}) == true && // c mask.getElementAt({3}) == false); // d if (!passed) { std::cout << " [FAIL] : in pd_test_categorical_index_isin()" << std::endl; throw std::runtime_error("pd_test_categorical_index_isin failed"); } .. _example-categoricalarray-unique-45: .. dropdown:: unique (pd_test_1_all.cpp:1345) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 1335 :emphasize-lines: 11 pandas::DatetimeArray arr(std::vector{ "2023-01-01", "2023-06-15", "2023-01-01", "NaT", "2023-06-15", "NaT" }); // unique auto uniq = arr.unique(); // Should have: NaT, 2023-01-01, 2023-06-15 (3 unique values) if (uniq.size() != 3) { std::cout << " [FAIL] : unique size should be 3, got " << uniq.size() << std::endl; throw std::runtime_error("pd_test_datetime_array_unique failed: size"); } // factorize auto [codes, uniques] = arr.factorize(); // Codes for NaT should be -1 if (codes.getElementAt({3}) != -1) { .. _example-categoricalarray-is_na-46: .. dropdown:: is_na (pd_test_1_all.cpp:51) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 41 :emphasize-lines: 11 void pd_test_boolean_array_na_handling() { std::cout << "========= BooleanArray: NA handling ======================= "; pandas::BooleanArray arr({ std::optional(true), std::nullopt, // NA at index 1 std::optional(false) }); if (!arr.is_na(1)) { std::cout << " [FAIL] : in pd_test_boolean_array_na_handling() : is_na(1) should be true" << std::endl; throw std::runtime_error("pd_test_boolean_array_na_handling failed: is_na(1) should be true"); } if (arr.is_na(0)) { std::cout << " [FAIL] : in pd_test_boolean_array_na_handling() : is_na(0) should be false" << std::endl; throw std::runtime_error("pd_test_boolean_array_na_handling failed: is_na(0) should be false"); } if (!arr.has_na()) { .. _example-categoricalarray-argmax-47: .. dropdown:: argmax (pd_test_1_all.cpp:1323) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 1313 :emphasize-lines: 11 } // argmin auto min_idx = arr.argmin(); if (!min_idx.has_value() || min_idx.value() != 2) { std::cout << " [FAIL] : argmin should be 2 (2023-01-01)" << std::endl; throw std::runtime_error("pd_test_datetime_array_sorting failed: argmin"); } // argmax auto max_idx = arr.argmax(); if (!max_idx.has_value() || max_idx.value() != 3) { std::cout << " [FAIL] : argmax should be 3 (2023-12-31)" << std::endl; throw std::runtime_error("pd_test_datetime_array_sorting failed: argmax"); } std::cout << " -> tests passed" << std::endl; } void pd_test_datetime_array_unique() { std::cout << "========= DatetimeArray: unique/factorize ======================= "; .. _example-categoricalarray-argmin-48: .. dropdown:: argmin (pd_test_1_all.cpp:1316) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 1306 :emphasize-lines: 11 if (indices.getElementAt({0}) != 2) { std::cout << " [FAIL] : argsort: first should be index 2 (2023-01-01)" << std::endl; throw std::runtime_error("pd_test_datetime_array_sorting failed: argsort first"); } if (indices.getElementAt({3}) != 1) { std::cout << " [FAIL] : argsort: last should be index 1 (NaT)" << std::endl; throw std::runtime_error("pd_test_datetime_array_sorting failed: NaT position"); } // argmin auto min_idx = arr.argmin(); if (!min_idx.has_value() || min_idx.value() != 2) { std::cout << " [FAIL] : argmin should be 2 (2023-01-01)" << std::endl; throw std::runtime_error("pd_test_datetime_array_sorting failed: argmin"); } // argmax auto max_idx = arr.argmax(); if (!max_idx.has_value() || max_idx.value() != 3) { std::cout << " [FAIL] : argmax should be 3 (2023-12-31)" << std::endl; throw std::runtime_error("pd_test_datetime_array_sorting failed: argmax"); .. _example-categoricalarray-as_ordered-49: .. dropdown:: as_ordered (pd_test_1_all.cpp:791) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 781 :emphasize-lines: 11 unordered.min(); } catch (const std::exception&) { threw = true; } if (!threw) { std::cout << " [FAIL] : in pd_test_categorical_array_ordered_operations() : unordered min should throw" << std::endl; throw std::runtime_error("pd_test_categorical_array_ordered_operations failed: unordered min should throw"); } // Test as_ordered / as_unordered pandas::CategoricalArray reordered = unordered.as_ordered(); if (!reordered.ordered()) { std::cout << " [FAIL] : in pd_test_categorical_array_ordered_operations() : as_ordered failed" << std::endl; throw std::runtime_error("pd_test_categorical_array_ordered_operations failed: as_ordered failed"); } std::cout << " -> tests passed" << std::endl; } void pd_test_categorical_array_comparisons() { std::cout << "========= CategoricalArray: comparisons ======================= "; .. 
_example-categoricalarray-as_unordered-50: .. dropdown:: as_unordered (pd_test_1_all.cpp:778) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 768 :emphasize-lines: 11 } // Test max std::optional max_val = arr.max(); if (!max_val.has_value() || *max_val != "high") { std::cout << " [FAIL] : in pd_test_categorical_array_ordered_operations() : max != 'high'" << std::endl; throw std::runtime_error("pd_test_categorical_array_ordered_operations failed: max != 'high'"); } // Test unordered throws for min/max pandas::CategoricalArray unordered = arr.as_unordered(); bool threw = false; try { unordered.min(); } catch (const std::exception&) { threw = true; } if (!threw) { std::cout << " [FAIL] : in pd_test_categorical_array_ordered_operations() : unordered min should throw" << std::endl; throw std::runtime_error("pd_test_categorical_array_ordered_operations failed: unordered min should throw"); } .. _example-categoricalarray-categories-51: .. dropdown:: categories (pd_test_1_all.cpp:389) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 379 :emphasize-lines: 11 std::vector> vals = { std::optional("low"), std::optional("high"), std::optional("medium") }; pandas::CategoricalArray arr3(vals, cats, true); // ordered if (!arr3.ordered()) { std::cout << " [FAIL] : in pd_test_categorical_array_constructors() : should be ordered" << std::endl; throw std::runtime_error("pd_test_categorical_array_constructors failed: should be ordered"); } if (arr3.categories().size() != 3) { std::cout << " [FAIL] : in pd_test_categorical_array_constructors() : categories size != 3" << std::endl; throw std::runtime_error("pd_test_categorical_array_constructors failed: categories size != 3"); } std::cout << " -> tests passed" << std::endl; } void pd_test_categorical_array_from_codes() { std::cout << "========= CategoricalArray: from_codes ======================= "; .. _example-categoricalarray-check_for_ordered-52: .. 
_example-categoricalarray-check_for_ordered-53: .. _example-categoricalarray-check_for_ordered-54: .. dropdown:: check_for_ordered (pd_test_3_all.cpp:1831) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 1821 :emphasize-lines: 11 auto codes = arr.astype_codes(); if (codes.getSize() != 5) { throw std::runtime_error("astype_codes failed"); } std::cout << " -> tests passed" << std::endl; } void pd_test_3_all_categorical_check_ordered() { std::cout << "========= CategoricalArray.check_for_ordered() ========"; std::vector> values = {"a", "b", "c"}; pandas::CategoricalArray unordered_arr(values, false); pandas::CategoricalArray ordered_arr(values, {"a", "b", "c"}, true); bool threw = false; try { unordered_arr.check_for_ordered("test_op"); } catch (const std::exception&) { threw = true; .. _example-categoricalarray-codes-55: .. dropdown:: codes (pd_test_1_all.cpp:473) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 463 :emphasize-lines: 11 std::cout << " -> tests passed" << std::endl; } void pd_test_categorical_array_codes_property() { std::cout << "========= CategoricalArray: codes property ======================= "; std::vector cats = {"x", "y", "z"}; std::vector codes = {0, 1, 2, 1, 0}; pandas::CategoricalArray arr = pandas::CategoricalArray::from_codes(codes, cats); numpy::NDArray arr_codes = arr.codes(); if (arr_codes.getSize() != 5) { std::cout << " [FAIL] : in pd_test_categorical_array_codes_property() : codes size != 5" << std::endl; throw std::runtime_error("pd_test_categorical_array_codes_property failed: codes size != 5"); } // Check codes match for (size_t i = 0; i < codes.size(); ++i) { if (arr_codes.getElementAt({i}) != codes[i]) { std::cout << " [FAIL] : in pd_test_categorical_array_codes_property() : code mismatch at " << i << std::endl; throw std::runtime_error("pd_test_categorical_array_codes_property failed: code mismatch"); .. _example-categoricalarray-delete_-56: .. _example-categoricalarray-delete_-57: .. dropdown:: delete_ (pd_test_1_all.cpp:10501) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 10491 :emphasize-lines: 11 std::cout << " -> tests passed" << std::endl; } void pd_test_extension_index_delete() { std::cout << "========= delete_ ========================="; pandas::CategoricalArray arr({"a", "b", "c", "d"}); pandas::CategoricalIndex idx(arr); auto deleted = idx.delete_(1); auto v0 = deleted[0]; auto v1 = deleted[1]; auto v2 = deleted[2]; bool passed = (deleted.size() == 3 && v0.has_value() && *v0 == "a" && v1.has_value() && *v1 == "c" && v2.has_value() && *v2 == "d"); if (!passed) { std::cout << " [FAIL] : in pd_test_extension_index_delete() : delete_ check failed" << std::endl; .. _example-categoricalarray-dtype-58: .. dropdown:: dtype (pd_test_1_all.cpp:295) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 285 :emphasize-lines: 11 throw std::runtime_error("pd_test_boolean_array_reductions failed: mean"); } std::cout << " -> tests passed" << std::endl; } void pd_test_boolean_array_dtype() { std::cout << "========= BooleanArray: dtype ======================= "; pandas::BooleanArray arr; if (arr.dtype().name() != "boolean") { std::cout << " [FAIL] : in pd_test_boolean_array_dtype() : dtype name should be 'boolean'" << std::endl; throw std::runtime_error("pd_test_boolean_array_dtype failed: dtype name"); } if (arr.dtype().kind() != "b") { std::cout << " [FAIL] : in pd_test_boolean_array_dtype() : dtype kind should be 'b'" << std::endl; throw std::runtime_error("pd_test_boolean_array_dtype failed: dtype kind"); } std::cout << " -> tests passed" << std::endl; .. _example-categoricalarray-empty-59: .. dropdown:: empty (pd_test_1_all.cpp:941) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 931 :emphasize-lines: 11 #include "../pandas/pd_config.h" namespace dataframe_tests { namespace dataframe_tests_config { void pd_test_config_version() { std::cout << "========= df_config: version info ======================= "; const char* version = pandas::DataFrameInfo::version(); if (version == nullptr || std::string(version).empty()) { std::cout << "[FAIL] : in pd_test_config_version() : version is null or empty" << std::endl; throw std::runtime_error("pd_test_config_version failed: version is null or empty"); } std::cout << "-> tests passed" << std::endl; } void pd_test_config_na_repr() { std::cout << "========= df_config: NA representation ======================= "; const char* na_repr = pandas::DataFrameConfig::get_na_repr(); if (na_repr == nullptr) { .. _example-categoricalarray-factorize-60: .. dropdown:: factorize (pd_test_1_all.cpp:1353) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 1343 :emphasize-lines: 11 // unique auto uniq = arr.unique(); // Should have: NaT, 2023-01-01, 2023-06-15 (3 unique values) if (uniq.size() != 3) { std::cout << " [FAIL] : unique size should be 3, got " << uniq.size() << std::endl; throw std::runtime_error("pd_test_datetime_array_unique failed: size"); } // factorize auto [codes, uniques] = arr.factorize(); // Codes for NaT should be -1 if (codes.getElementAt({3}) != -1) { std::cout << " [FAIL] : factorize: NaT code should be -1" << std::endl; throw std::runtime_error("pd_test_datetime_array_unique failed: NaT code"); } // Same values should have same codes if (codes.getElementAt({0}) != codes.getElementAt({2})) { std::cout << " [FAIL] : factorize: 2023-01-01 values should have same code" << std::endl; throw std::runtime_error("pd_test_datetime_array_unique failed: same code"); } .. _example-categoricalarray-has_na-61: .. dropdown:: has_na (pd_test_1_all.cpp:61) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 51 :emphasize-lines: 11 if (!arr.is_na(1)) { std::cout << " [FAIL] : in pd_test_boolean_array_na_handling() : is_na(1) should be true" << std::endl; throw std::runtime_error("pd_test_boolean_array_na_handling failed: is_na(1) should be true"); } if (arr.is_na(0)) { std::cout << " [FAIL] : in pd_test_boolean_array_na_handling() : is_na(0) should be false" << std::endl; throw std::runtime_error("pd_test_boolean_array_na_handling failed: is_na(0) should be false"); } if (!arr.has_na()) { std::cout << " [FAIL] : in pd_test_boolean_array_na_handling() : has_na() should be true" << std::endl; throw std::runtime_error("pd_test_boolean_array_na_handling failed: has_na() should be true"); } if (arr.count() != 2) { std::cout << " [FAIL] : in pd_test_boolean_array_na_handling() : count() should be 2" << std::endl; throw std::runtime_error("pd_test_boolean_array_na_handling failed: count() should be 2"); } std::cout << " -> tests passed" << std::endl; .. _example-categoricalarray-memory_usage-62: .. dropdown:: memory_usage (pd_test_1_all.cpp:27063) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 27053 :emphasize-lines: 11 } std::cout << "====================================== [OK] pd_test_value_counts test suite ========================== " << std::endl; return 0; } } // namespace dataframe_tests // ------------------- pd_test_value_counts.cpp (end) ----------------------------- // ------------------- pd_test_memory_usage.cpp (start) ----------------------------- // Tests for DataFrame.memory_usage() - pandas-compatible memory usage reporting namespace dataframe_tests { namespace dataframe_tests_memory_usage { void pd_test_memory_usage_basic() { std::cout << "========= basic memory_usage ======================="; // Create a simple DataFrame with multiple columns std::map> data; data["A"] = {1.0, 2.0, 3.0, 4.0, 5.0}; .. _example-categoricalarray-name-63: .. 
dropdown:: name (pd_test_1_all.cpp:295) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 285 :emphasize-lines: 11 throw std::runtime_error("pd_test_boolean_array_reductions failed: mean"); } std::cout << " -> tests passed" << std::endl; } void pd_test_boolean_array_dtype() { std::cout << "========= BooleanArray: dtype ======================= "; pandas::BooleanArray arr; if (arr.dtype().name() != "boolean") { std::cout << " [FAIL] : in pd_test_boolean_array_dtype() : dtype name should be 'boolean'" << std::endl; throw std::runtime_error("pd_test_boolean_array_dtype failed: dtype name"); } if (arr.dtype().kind() != "b") { std::cout << " [FAIL] : in pd_test_boolean_array_dtype() : dtype kind should be 'b'" << std::endl; throw std::runtime_error("pd_test_boolean_array_dtype failed: dtype kind"); } std::cout << " -> tests passed" << std::endl; .. _example-categoricalarray-nbytes-64: .. dropdown:: nbytes (pd_test_1_all.cpp:6214) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 6204 :emphasize-lines: 11 } // Test empty DataFrame pandas::DataFrame empty_df; if (!empty_df.empty()) { std::cout << " [FAIL] : in pd_test_dataframe_properties() : should be empty" << std::endl; throw std::runtime_error("pd_test_dataframe_properties failed: should be empty"); } // Test nbytes > 0 for non-empty if (df.nbytes() == 0) { std::cout << " [FAIL] : in pd_test_dataframe_properties() : nbytes should be > 0" << std::endl; throw std::runtime_error("pd_test_dataframe_properties failed: nbytes should be > 0"); } // Test columns index if (df.columns().size() != 3) { std::cout << " [FAIL] : in pd_test_dataframe_properties() : columns size != 3" << std::endl; throw std::runtime_error("pd_test_dataframe_properties failed: columns size != 3"); } .. _example-categoricalarray-ndim-65: .. dropdown:: ndim (pd_test_1_all.cpp:6195) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 6185 :emphasize-lines: 11 pandas::DataFrame df(data); // Test shape auto shape = df.shape(); if (shape.size() != 2 || shape[0] != 4 || shape[1] != 3) { std::cout << " [FAIL] : in pd_test_dataframe_properties() : shape mismatch" << std::endl; throw std::runtime_error("pd_test_dataframe_properties failed: shape mismatch"); } // Test ndim if (df.ndim() != 2) { std::cout << " [FAIL] : in pd_test_dataframe_properties() : ndim != 2" << std::endl; throw std::runtime_error("pd_test_dataframe_properties failed: ndim != 2"); } // Test empty if (df.empty()) { std::cout << " [FAIL] : in pd_test_dataframe_properties() : should not be empty" << std::endl; throw std::runtime_error("pd_test_dataframe_properties failed: should not be empty"); } .. _example-categoricalarray-ordered-66: .. dropdown:: ordered (pd_test_1_all.cpp:359) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 349 :emphasize-lines: 11 void pd_test_categorical_array_constructors() { std::cout << "========= CategoricalArray: constructors ======================= "; // Default constructor pandas::CategoricalArray arr1; if (arr1.size() != 0) { std::cout << " [FAIL] : in pd_test_categorical_array_constructors() : default constructor size != 0" << std::endl; throw std::runtime_error("pd_test_categorical_array_constructors failed: default constructor size != 0"); } if (arr1.ordered()) { std::cout << " [FAIL] : in pd_test_categorical_array_constructors() : default should be unordered" << std::endl; throw std::runtime_error("pd_test_categorical_array_constructors failed: default should be unordered"); } // Constructor from values (infer categories) std::vector> values = { std::optional("a"), std::optional("b"), std::optional("a"), std::optional("c") .. _example-categoricalarray-ravel-67: .. dropdown:: ravel (pd_test_3_all.cpp:2147) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 2137 :emphasize-lines: 11 throw std::runtime_error("memory_usage shallow too small"); } if (deep < shallow) { throw std::runtime_error("memory_usage deep should be >= shallow"); } std::cout << " -> tests passed" << std::endl; } void pd_test_3_all_categorical_ravel_view() { std::cout << "========= CategoricalArray.ravel()/view() ============="; std::vector> values = {"a", "b", "c"}; pandas::CategoricalArray arr(values); auto raveled = arr.ravel(); if (raveled.size() != 3 || !raveled.equals(arr)) { throw std::runtime_error("ravel failed"); } auto viewed = arr.view(); .. _example-categoricalarray-remove_categories-68: .. dropdown:: remove_categories (pd_test_1_all.cpp:591) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 581 :emphasize-lines: 11 } void pd_test_categorical_array_remove_categories() { std::cout << "========= CategoricalArray: remove_categories ======================= "; std::vector cats = {"a", "b", "c"}; std::vector codes = {0, 1, 2, 1}; // a, b, c, b pandas::CategoricalArray arr = pandas::CategoricalArray::from_codes(codes, cats); // Remove 'c' - values with 'c' become NA pandas::CategoricalArray result = arr.remove_categories({"c"}); if (result.categories().size() != 2) { std::cout << " [FAIL] : in pd_test_categorical_array_remove_categories() : categories size != 2" << std::endl; throw std::runtime_error("pd_test_categorical_array_remove_categories failed: categories size != 2"); } // Element at index 2 should now be NA (was 'c') if (!result.is_na(2)) { std::cout << " [FAIL] : in pd_test_categorical_array_remove_categories() : removed category should be NA" << std::endl; throw std::runtime_error("pd_test_categorical_array_remove_categories failed: removed category should be NA"); .. _example-categoricalarray-remove_unused_categories-69: .. dropdown:: remove_unused_categories (pd_test_1_all.cpp:737) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 727 :emphasize-lines: 11 std::cout << " -> tests passed" << std::endl; } void pd_test_categorical_array_remove_unused_categories() { std::cout << "========= CategoricalArray: remove_unused_categories ======================= "; std::vector cats = {"a", "b", "c", "d"}; std::vector codes = {0, 0, 2}; // a, a, c (b and d unused) pandas::CategoricalArray arr = pandas::CategoricalArray::from_codes(codes, cats); pandas::CategoricalArray result = arr.remove_unused_categories(); // Only 'a' and 'c' should remain if (result.categories().size() != 2) { std::cout << " [FAIL] : in pd_test_categorical_array_remove_unused_categories() : categories size != 2" << std::endl; throw std::runtime_error("pd_test_categorical_array_remove_unused_categories failed: categories size != 2"); } // Values should be preserved std::optional val0 = result[0]; std::optional val2 = result[2]; .. _example-categoricalarray-reorder_categories-70: .. dropdown:: reorder_categories (pd_test_1_all.cpp:695) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 685 :emphasize-lines: 11 void pd_test_categorical_array_reorder_categories() { std::cout << "========= CategoricalArray: reorder_categories ======================= "; std::vector cats = {"a", "b", "c"}; std::vector codes = {0, 1, 2}; // a, b, c pandas::CategoricalArray arr = pandas::CategoricalArray::from_codes(codes, cats); // Reorder categories std::vector new_order = {"c", "b", "a"}; pandas::CategoricalArray result = arr.reorder_categories(new_order); // Check categories are reordered const std::vector& result_cats = result.categories(); if (result_cats[0] != "c" || result_cats[1] != "b" || result_cats[2] != "a") { std::cout << " [FAIL] : in pd_test_categorical_array_reorder_categories() : categories not reordered" << std::endl; throw std::runtime_error("pd_test_categorical_array_reorder_categories failed: categories not reordered"); } // Values should be preserved std::optional val0 = result[0]; .. _example-categoricalarray-repeat-71: .. dropdown:: repeat (pd_test_3_all.cpp:2166) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 2156 :emphasize-lines: 11 auto viewed = arr.view(); if (viewed.size() != 3 || !viewed.equals(arr)) { throw std::runtime_error("view failed"); } std::cout << " -> tests passed" << std::endl; } void pd_test_3_all_categorical_repeat() { std::cout << "========= CategoricalArray.repeat() ==================="; std::vector> values = {"a", "b"}; pandas::CategoricalArray arr(values); auto result = arr.repeat(3); if (result.size() != 6 || *result[0] != "a" || *result[2] != "a" || *result[3] != "b" || *result[5] != "b") { throw std::runtime_error("repeat scalar failed"); } .. _example-categoricalarray-repeat-72: .. dropdown:: repeat (pd_test_3_all.cpp:2166) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 2156 :emphasize-lines: 11 auto viewed = arr.view(); if (viewed.size() != 3 || !viewed.equals(arr)) { throw std::runtime_error("view failed"); } std::cout << " -> tests passed" << std::endl; } void pd_test_3_all_categorical_repeat() { std::cout << "========= CategoricalArray.repeat() ==================="; std::vector> values = {"a", "b"}; pandas::CategoricalArray arr(values); auto result = arr.repeat(3); if (result.size() != 6 || *result[0] != "a" || *result[2] != "a" || *result[3] != "b" || *result[5] != "b") { throw std::runtime_error("repeat scalar failed"); } .. _example-categoricalarray-repr-73: .. dropdown:: repr (pd_test_1_all.cpp:10906) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 10896 :emphasize-lines: 11 std::cout << " -> tests passed" << std::endl; } void pd_test_extension_index_repr() { std::cout << "========= repr ========================="; pandas::CategoricalArray arr({"a", "b", "c"}); // Use ExtensionIndex directly to test base class repr pandas::ExtensionIndex idx(arr, "test"); std::string repr_str = idx.repr(); bool passed = (!repr_str.empty() && repr_str.find("ExtensionIndex") != std::string::npos); if (!passed) { std::cout << " [FAIL] : in pd_test_extension_index_repr() : repr check failed" << std::endl; throw std::runtime_error("pd_test_extension_index_repr failed"); } std::cout << " -> tests passed" << std::endl; } .. _example-categoricalarray-reshape-74: .. dropdown:: reshape (pd_test_3_all.cpp:2186) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 2176 :emphasize-lines: 11 auto result2 = arr.repeat({1, 2}); if (result2.size() != 3 || *result2[0] != "a" || *result2[1] != "b" || *result2[2] != "b") { throw std::runtime_error("repeat array failed"); } std::cout << " -> tests passed" << std::endl; } void pd_test_3_all_categorical_reshape() { std::cout << "========= CategoricalArray.reshape() =================="; std::vector> values = {"a", "b", "c", "d"}; pandas::CategoricalArray arr(values); auto result = arr.reshape({4}); if (result.size() != 4) { throw std::runtime_error("reshape failed"); } bool threw = false; .. _example-categoricalarray-set_categories-75: .. dropdown:: set_categories (pd_test_1_all.cpp:623) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 613 :emphasize-lines: 11 void pd_test_categorical_array_set_categories() { std::cout << "========= CategoricalArray: set_categories ======================= "; std::vector cats = {"a", "b"}; std::vector codes = {0, 1, 0}; // a, b, a pandas::CategoricalArray arr = pandas::CategoricalArray::from_codes(codes, cats); // Set new categories (values not in new categories become NA) std::vector new_cats = {"a", "c"}; // 'b' removed, 'c' added pandas::CategoricalArray result = arr.set_categories(new_cats); if (result.categories().size() != 2) { std::cout << " [FAIL] : in pd_test_categorical_array_set_categories() : categories size != 2" << std::endl; throw std::runtime_error("pd_test_categorical_array_set_categories failed: categories size != 2"); } // Element at index 1 should be NA (was 'b', now not in categories) if (!result.is_na(1)) { std::cout << " [FAIL] : in pd_test_categorical_array_set_categories() : 'b' value should be NA" << std::endl; throw std::runtime_error("pd_test_categorical_array_set_categories failed: 'b' value should be NA"); .. _example-categoricalarray-set_name-76: .. dropdown:: set_name (pd_test_1_all.cpp:11798) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 11788 :emphasize-lines: 11 throw std::runtime_error("pd_test_index_vector_constructor failed"); } std::cout << " -> tests passed" << std::endl; } void pd_test_index_copy_constructor() { std::cout << "========= copy constructor ============================"; pandas::Index idx1{1, 2, 3}; idx1.set_name("original"); pandas::Index idx2(idx1); bool passed = (idx2.size() == 3); passed = passed && (idx2.name().value() == "original"); passed = passed && idx2.equals(idx1); if (!passed) { std::cout << " [FAIL] : in pd_test_index_copy_constructor() : copy failed" << std::endl; throw std::runtime_error("pd_test_index_copy_constructor failed"); .. _example-categoricalarray-set_ordered-77: .. dropdown:: set_ordered (pd_test_3_all.cpp:2210) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 2200 :emphasize-lines: 11 threw = true; } if (!threw) { throw std::runtime_error("reshape should throw for incompatible shape"); } std::cout << " -> tests passed" << std::endl; } void pd_test_3_all_categorical_set_ordered() { std::cout << "========= CategoricalArray.set_ordered() =============="; std::vector> values = {"a", "b", "c"}; pandas::CategoricalArray arr(values, false); if (arr.ordered()) { throw std::runtime_error("initial should be unordered"); } auto ordered = arr.set_ordered(true); if (!ordered.ordered()) { .. _example-categoricalarray-shape-78: .. dropdown:: shape (pd_test_1_all.cpp:6188) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 6178 :emphasize-lines: 11 std::cout << "========= properties ======================="; std::map> data; data["A"] = {1.0, 2.0, 3.0, 4.0}; data["B"] = {5.0, 6.0, 7.0, 8.0}; data["C"] = {9.0, 10.0, 11.0, 12.0}; pandas::DataFrame df(data); // Test shape auto shape = df.shape(); if (shape.size() != 2 || shape[0] != 4 || shape[1] != 3) { std::cout << " [FAIL] : in pd_test_dataframe_properties() : shape mismatch" << std::endl; throw std::runtime_error("pd_test_dataframe_properties failed: shape mismatch"); } // Test ndim if (df.ndim() != 2) { std::cout << " [FAIL] : in pd_test_dataframe_properties() : ndim != 2" << std::endl; throw std::runtime_error("pd_test_dataframe_properties failed: ndim != 2"); } .. _example-categoricalarray-size-79: .. dropdown:: size (pd_test_1_all.cpp:22) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 12 :emphasize-lines: 11 #include "../pandas/pd_boolean_array.h" namespace dataframe_tests { namespace dataframe_tests_boolean_array { void pd_test_boolean_array_constructors() { std::cout << "========= BooleanArray: constructors ======================= "; // Default constructor pandas::BooleanArray arr1; if (arr1.size() != 0) { std::cout << " [FAIL] : in pd_test_boolean_array_constructors() : default constructor size != 0" << std::endl; throw std::runtime_error("pd_test_boolean_array_constructors failed: default constructor size != 0"); } // Initializer list constructor pandas::BooleanArray arr2({ std::optional(true), std::optional(false), std::nullopt, std::optional(true) .. _example-categoricalarray-slice-80: .. dropdown:: slice (pd_test_1_all.cpp:17546) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 17536 :emphasize-lines: 11 // ============================================================================ // Slicing / Indexing Tests // ============================================================================ void pd_test_period_index_slice() { std::cout << "========= slice method ================================"; std::vector ordinals = {0, 1, 2, 3, 4}; pandas::PeriodIndex idx(ordinals, "D"); pandas::PeriodIndex sliced = idx.slice(1, 4); bool passed = (sliced.size() == 3 && sliced[0].has_value() && *sliced[0] == 1); if (!passed) { std::cout << " [FAIL] : in pd_test_period_index_slice()" << std::endl; throw std::runtime_error("pd_test_period_index_slice failed"); } std::cout << " -> tests passed" << std::endl; }