Lexer
=====

.. cpp:class:: pandas::Lexer

   Lexer for the query engine. Splits an input expression string into
   ``Token`` values for expression evaluation.

Example
-------

.. code-block:: cpp

   #include
   using namespace pandas;

   // Use Lexer
   Lexer obj;
   // ... operations ...

Comparison
----------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``Token next_token()``
     - Token
     - pd_query.h:203
     -

Other Methods
-------------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``void advance()``
     - void
     - pd_query.h:106
     -
   * - ``char current() const``
     - char
     - pd_query.h:97
     - :ref:`View <example-lexer-current-0>`
   * - ``explicit Lexer(std::string input)``
     - (constructor)
     - pd_query.h:201
     -
   * - ``char peek(size_t ahead = 1) const``
     - char
     - pd_query.h:101
     - :ref:`View <example-lexer-peek-1>`
   * - ``Token read_identifier()``
     - Token
     - pd_query.h:116
     -
   * - ``Token read_number()``
     - Token
     - pd_query.h:152
     -
   * - ``Token read_string(char quote)``
     - Token
     - pd_query.h:178
     -
   * - ``void skip_whitespace()``
     - void
     - pd_query.h:110
     -
   * - ``Token tok(TokenType::NUMBER, num)``
     - Token
     - pd_query.h:173
     -

Code Examples
-------------

The following examples are extracted from the test suite.

.. _example-lexer-current-0:

.. dropdown:: current (pd_test_4_all.cpp:1140)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 1130
      :emphasize-lines: 11

                                 const std::string& actual) {
          int _f = 0;
          pandas_tests::check_str_ws(label, expected, actual, _f);
          if (_f > 0) throw std::runtime_error(label + ": str mismatch");
      }

      // ----------------------------------------------------------------------------
      // Case 1 — dtype.int32_df_nsmallest
      // ----------------------------------------------------------------------------
      void dtype_int32_df_nsmallest() {
          // Strategy B: synthesize the current (buggy) post-nsmallest state.
          // Column A is double because int32 is silently promoted inside
          // pandas::DataFrame::nsmallest today. Column B (from range(10)) stays
          // int64. Row index labels "2","6","4" are the original positions of the
          // 3 smallest A values, ties broken by first-occurrence.
          pandas::DataFrame df;
          df.add_column("A", std::vector{1.0, 2.0, 3.0});
          df.add_column("B", std::vector{2, 6, 4});
          df.set_index(std::make_unique>(
              std::vector{"2", "6", "4"}));
          apply_default_display(df);

.. _example-lexer-peek-1:

.. dropdown:: peek (pd_test_5_all.cpp:123633)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 123623
      :emphasize-lines: 11

      if (!in.good()) return false;
      std::string field;
      bool in_quotes = false;
      char ch;
      bool any = false;
      while (in.get(ch)) {
          any = true;
          if (in_quotes) {
              if (ch == '"') {
                  // Lookahead: doubled quote = literal quote.
                  if (in.peek() == '"') {
                      in.get(ch);
                      field.push_back('"');
                  } else {
                      in_quotes = false;
                  }
              } else if (ch == '\r') {
                  // Strip CR even inside quotes: the oracle CSV uses \r\n for
                  // newlines inside quoted multiline cells, but `format_*` only
                  // emits \n. Normalise here so byte-equality holds.
              } else {
                  field.push_back(ch);
              }
          } else {
              if (ch == '"') {
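For orientation, the cursor-style methods listed in the tables (``current``, ``peek``, ``advance``, ``skip_whitespace``, ``next_token``) can be sketched as follows. This is a minimal illustration and not the code in ``pd_query.h``: ``SketchLexer``, ``SketchToken``, ``SketchKind``, ``sketch_tokenize``, the token kinds, and the convention of returning ``'\0'`` at end of input are all assumptions made for this example.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical token kinds for this sketch; the real Token type may differ.
enum class SketchKind { Identifier, Number, Symbol, End };

struct SketchToken {
    SketchKind kind;
    std::string text;
};

// Hypothetical stand-in for a cursor-based lexer; not pandas::Lexer itself.
class SketchLexer {
public:
    explicit SketchLexer(std::string input) : input_(std::move(input)) {}

    // Character under the cursor, or '\0' once the input is exhausted.
    char current() const { return peek(0); }

    // Character `ahead` positions past the cursor, without consuming anything.
    char peek(std::size_t ahead = 1) const {
        return pos_ + ahead < input_.size() ? input_[pos_ + ahead] : '\0';
    }

    // Move the cursor one character forward, never past the end.
    void advance() { if (pos_ < input_.size()) ++pos_; }

    void skip_whitespace() {
        while (std::isspace(static_cast<unsigned char>(current()))) advance();
    }

    // Produce the next token, or an End token when the input is exhausted.
    SketchToken next_token() {
        skip_whitespace();
        if (pos_ >= input_.size()) return {SketchKind::End, ""};
        const unsigned char c = static_cast<unsigned char>(current());
        if (std::isalpha(c) || c == '_') return read_while(SketchKind::Identifier, is_ident_char);
        if (std::isdigit(c)) return read_while(SketchKind::Number, is_digit_char);
        std::string text(1, current());  // any other character is a one-char symbol
        advance();
        return {SketchKind::Symbol, text};
    }

private:
    static bool is_ident_char(unsigned char c) { return std::isalnum(c) || c == '_'; }
    static bool is_digit_char(unsigned char c) { return std::isdigit(c) != 0; }

    // Consume characters while `accept` holds and wrap them in a token.
    SketchToken read_while(SketchKind kind, bool (*accept)(unsigned char)) {
        std::string text;
        while (pos_ < input_.size() && accept(static_cast<unsigned char>(current()))) {
            text.push_back(current());
            advance();
        }
        return {kind, text};
    }

    std::string input_;
    std::size_t pos_ = 0;
};

// Drain a lexer into a vector of token texts (the End token is excluded).
std::vector<std::string> sketch_tokenize(const std::string& input) {
    SketchLexer lexer(input);
    std::vector<std::string> out;
    for (SketchToken t = lexer.next_token(); t.kind != SketchKind::End; t = lexer.next_token())
        out.push_back(t.text);
    return out;
}
```

With this sketch, ``sketch_tokenize("col_a >= 10")`` yields the texts ``col_a``, ``>``, ``=``, ``10``. The two-character operator comes out as two symbols because the sketch never looks ahead; a fuller lexer would typically call ``peek()`` before consuming ``>`` to decide whether to merge it with a following ``=``.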