[fix](bug) Resolve the crash issue during string hash computation #48580

Closed
wants to merge 1 commit into from

Conversation

felixwluo
Contributor

What problem does this PR solve?

Problem Summary:

  1. BE core dump
(gdb) bt
#0  doris::CRC32Hash::operator() (this=<optimized out>, x=...) at /root/be/src/vec/common/string_ref.h:391
#1  phmap::priv::raw_hash_set<phmap::priv::FlatHashMapPolicy<doris::StringRef, char*>, DefaultHash<doris::StringRef, void>, phmap::EqualTo<doris::StringRef>, doris::vectorized::Allocator_<std::pair<doris::StringRef const, char*> > >::HashElement::operator()<doris::StringRef, std::piecewise_construct_t const&, std::tuple<doris::StringRef const&>, std::tuple<char* const&> > (this=<optimized out>, key=...)
    at /var/local/thirdparty/installed/include/parallel_hashmap/phmap.h:1867
#2  phmap::priv::memory_internal::DecomposePairImpl<phmap::priv::raw_hash_set<phmap::priv::FlatHashMapPolicy<doris::StringRef, char*>, DefaultHash<doris::StringRef, void>, phmap::EqualTo<doris::StringRef>, doris::vectorized::Allocator_<std::pair<doris::StringRef const, char*> > >::HashElement, doris::StringRef const&, std::tuple<char* const&> > (f=..., p=...) at /var/local/thirdparty/installed/include/parallel_hashmap/phmap.h:751
#3  phmap::priv::DecomposePair<phmap::priv::raw_hash_set<phmap::priv::FlatHashMapPolicy<doris::StringRef, char*>, DefaultHash<doris::StringRef, void>, phmap::EqualTo<doris::StringRef>, doris::vectorized::Allocator_<std::pair<doris::StringRef const, char*> > >::HashElement, std::pair<doris::StringRef const, char*>&> (f=..., args=...) at /var/local/thirdparty/installed/include/parallel_hashmap/phmap.h:4119
#4  phmap::priv::FlatHashMapPolicy<doris::StringRef, char*>::apply<phmap::priv::raw_hash_set<phmap::priv::FlatHashMapPolicy<doris::StringRef, char*>, DefaultHash<doris::StringRef, void>, phmap::EqualTo<doris::StringRef>, doris::vectorized::Allocator_<std::pair<doris::StringRef const, char*> > >::HashElement, std::pair<doris::StringRef const, char*>&> (f=..., args=...)
    at /var/local/thirdparty/installed/include/parallel_hashmap/phmap.h:4222
#5  phmap::priv::hash_policy_traits<phmap::priv::FlatHashMapPolicy<doris::StringRef, char*>, void>::apply<phmap::priv::raw_hash_set<phmap::priv::FlatHashMapPolicy<doris::StringRef, char*>, DefaultHash<doris::StringRef, void>, phmap::EqualTo<doris::StringRef>, doris::vectorized::Allocator_<std::pair<doris::StringRef const, char*> > >::HashElement, std::pair<doris::StringRef const, char*>&, phmap::priv::FlatHashMapPolicy<doris::StringRef, char*> > (f=..., ts=...) at /var/local/thirdparty/installed/include/parallel_hashmap/phmap_base.h:548
#6  phmap::priv::raw_hash_set<phmap::priv::FlatHashMapPolicy<doris::StringRef, char*>, DefaultHash<doris::StringRef, void>, phmap::EqualTo<doris::StringRef>, doris::vectorized::Allocator_<std::pair<doris::StringRef const, char*> > >::resize (this=0x560c564efe40, 
    new_capacity=<optimized out>) at /var/local/thirdparty/installed/include/parallel_hashmap/phmap.h:2019
#7  0x0000560957eba1b2 in phmap::priv::raw_hash_set<phmap::priv::FlatHashMapPolicy<doris::StringRef, char*>, DefaultHash<doris::StringRef, void>, phmap::EqualTo<doris::StringRef>, doris::vectorized::Allocator_<std::pair<doris::StringRef const, char*> > >::prepare_insert (
    this=0x560c564efe40, hashval=4346916887181955022) at /var/local/thirdparty/installed/include/parallel_hashmap/phmap.h:2198
#8  0x0000560957eba0e9 in phmap::priv::raw_hash_set<phmap::priv::FlatHashMapPolicy<doris::StringRef, char*>, DefaultHash<doris::StringRef, void>, phmap::EqualTo<doris::StringRef>, doris::vectorized::Allocator_<std::pair<doris::StringRef const, char*> > >::find_or_prepare_insert<doris::StringRef> (this=0x0, key=..., hashval=94595109437455) at /var/local/thirdparty/installed/include/parallel_hashmap/phmap.h:2186
#9  0x0000560959640016 in phmap::priv::raw_hash_set<phmap::priv::FlatHashMapPolicy<doris::StringRef, char*>, DefaultHash<doris::StringRef, void>, phmap::EqualTo<doris::StringRef>, doris::vectorized::Allocator_<std::pair<doris::StringRef const, char*> > >::lazy_emplace_with_hash<doris::StringRef, PHHashMap<doris::StringRef, char*, DefaultHash<doris::StringRef, void>, false>::lazy_emplace<doris::vectorized::ArenaKeyHolder&, doris::vectorized::AggregationNode::_emplace_into_hash_table(char**, std::vector<doris::vectorized::IColumn const*, std::allocator<doris::vectorized::IColumn const*> >&, unsigned long)::$_0::operator()<doris::vectorized::AggregationMethodSerialized<PHHashMap<doris::StringRef, char*, DefaultHash<doris::StringRef, void>, false> >&>(doris::vectorized::AggregationMethodSerialized<PHHashMap<doris::StringRef, char*, DefaultHash<doris::StringRef, void>, false> >&) const::{lambda(auto:1 const&, auto:2 const&)#1}&>(doris::vectorized::AggregationMethodSerialized<PHHashMap<doris::StringRef, char*, DefaultHash<doris::StringRef, void>, false> >&, std::pair<doris::StringRef const, char*>*&, unsigned long, doris::vectorized::AggregationNode::_emplace_into_hash_table(char**, std::vector<doris::vectorized::IColumn const*, std::allocator<doris::vectorized::IColumn const*> >&, unsigned long)::$_0::operator()<doris::vectorized::AggregationMethodSerialized<PHHashMap<doris::StringRef, char*, DefaultHash<doris::StringRef, void>, false> >&>(doris::vectorized::AggregationMethodSerialized<PHHashMap<doris::StringRef, char*, DefaultHash<doris::StringRef, void>, false> >&) const::{lambda(auto:1 const&, auto:2 const&)#1}&)::{lambda(auto:1 const&)#1}>(doris::StringRef const&, unsigned long, doris::vectorized::AggregationNode::_emplace_into_hash_table(char**, std::vector<doris::vectorized::IColumn const*, std::allocator<doris::vectorized::IColumn const*> >&, unsigned long)::$_0::operator()<doris::vectorized::AggregationMethodSerialized<PHHashMap<doris::StringRef, char*, 
DefaultHash<doris::StringRef, void>, false> >&>(doris::vectorized::AggregationMethodSerialized<PHHashMap<doris::StringRef, char*, DefaultHash<doris::StringRef, void>, false> >&) const::{lambda(auto:1 const&, auto:2 const&)#1}&) (this=0x560c564efe40, key=..., hashval=94595109437455, f=...)
    at /var/local/thirdparty/installed/include/parallel_hashmap/phmap.h:1534
#10 PHHashMap<doris::StringRef, char*, DefaultHash<doris::StringRef, void>, false>::lazy_emplace<doris::vectorized::ArenaKeyHolder&, doris::vectorized::AggregationNode::_emplace_into_hash_table(char**, std::vector<doris::vectorized::IColumn const*, std::allocator<doris::vectorized::IColumn const*> >&, unsigned long)::$_0::operator()<doris::vectorized::AggregationMethodSerialized<PHHashMap<doris::StringRef, char*, DefaultHash<doris::StringRef, void>, false> >&>(doris::vectorized::AggregationMethodSerialized<PHHashMap<doris::StringRef, char*, DefaultHash<doris::StringRef, void>, false> >&) const::{lambda(auto:1 const&, auto:2 const&)#1}&>(doris::vectorized::AggregationMethodSerialized<PHHashMap<doris::StringRef, char*, DefaultHash<doris::StringRef, void>, false> >&, std::pair<doris::StringRef const, char*>*&, unsigned long, doris::vectorized::AggregationNode::_emplace_into_hash_table(char**, std::vector<doris::vectorized::IColumn const*, std::allocator<doris::vectorized::IColumn const*> >&, unsigned long)::$_0::operator()<doris::vectorized::AggregationMethodSerialized<PHHashMap<doris::StringRef, char*, DefaultHash<doris::StringRef, void>, false> >&>(doris::vectorized::AggregationMethodSerialized<PHHashMap<doris::StringRef, char*, DefaultHash<doris::StringRef, void>, false> >&) const::{lambda(auto:1 const&, auto:2 const&)#1}&) (this=0x560c564efe40, key_holder=..., hash_value=94595109437455, it=<optimized out>, f=...)
    at /root/be/src/vec/common/hash_table/ph_hash_map.h:177
#11 doris::vectorized::ColumnsHashing::columns_hashing_impl::HashMethodBase<doris::vectorized::ColumnsHashing::HashMethodSerialized<std::pair<doris::StringRef const, char*>, char*, true>, std::pair<doris::StringRef const, char*>, char*, false>::lazy_emplace_impl<PHHashMap<doris::StringRef, char*, DefaultHash<doris::StringRef, void>, false>, doris::vectorized::ArenaKeyHolder, doris::vectorized::AggregationNode::_emplace_into_hash_table(char**, std::vector<doris::vectorized::IColumn const*, std::allocator<doris::vectorized::IColumn const*> >&, unsigned long)::$_0::operator()<doris::vectorized::AggregationMethodSerialized<PHHashMap<doris::StringRef, char*, DefaultHash<doris::StringRef, void>, false> >&>(doris::vectorized::AggregationMethodSerialized<PHHashMap<doris::StringRef, char*, DefaultHash<doris::StringRef, void>, false> >&) const::{lambda(auto:1 const&, auto:2 const&)#1}&>(doris::vectorized::ArenaKeyHolder&, unsigned long, doris::vectorized::AggregationMethodSerialized<PHHashMap<doris::StringRef, char*, DefaultHash<doris::StringRef, void>, false> >&, doris::vectorized::AggregationNode::_emplace_into_hash_table(char**, std::vector<doris::vectorized::IColumn const*, std::allocator<doris::vectorized::IColumn const*> >&, unsigned long)::$_0::operator()<doris::vectorized::AggregationMethodSerialized<PHHashMap<doris::StringRef, char*, DefaultHash<doris::StringRef, void>, false> >&>(doris::vectorized::AggregationMethodSerialized<PHHashMap<doris::StringRef, char*, DefaultHash<doris::StringRef, void>, false> >&) const::{lambda(auto:1 const&, auto:2 const&)#1}&) (key_holder=..., 
    hash_value=94595109437455, data=..., this=<optimized out>, f=...) at /root/be/src/vec/common/columns_hashing_impl.h:316
#12 doris::vectorized::ColumnsHashing::columns_hashing_impl::HashMethodBase<doris::vectorized::ColumnsHashing::HashMethodSerialized<std::pair<doris::StringRef const, char*>, char*, true>, std::pair<doris::StringRef const, char*>, char*, false>::lazy_emplace_key<PHHashMap<doris::StringRef, char*, DefaultHash<doris::StringRef, void>, false>, doris::vectorized::AggregationNode::_emplace_into_hash_table(char**, std::vector<doris::vectorized::IColumn const*, std::allocator<doris::vectorized::IColumn const*> >&, unsigned long)::$_0::operator()<doris::vectorized::AggregationMethodSerialized<PHHashMap<doris::StringRef, char*, DefaultHash<doris::StringRef, void>, false> >&>(doris::vectorized::AggregationMethodSerialized<PHHashMap<doris::StringRef, char*, DefaultHash<doris::StringRef, void>, false> >&) const::{lambda(auto:1 const&, auto:2 const&)#1}&>(doris::vectorized::AggregationMethodSerialized<PHHashMap<doris::StringRef, char*, DefaultHash<doris::StringRef, void>, false> >&, unsigned long, unsigned long, doris::vectorized::Arena&, doris::vectorized::AggregationNode::_emplace_into_hash_table(char**, std::vector<doris::vectorized::IColumn const*, std::allocator<doris::vectorized::IColumn const*> >&, unsigned long)::$_0::operator()<doris::vectorized::AggregationMethodSerialized<PHHashMap<doris::StringRef, char*, DefaultHash<doris::StringRef, void>, false> >&>(doris::vectorized::AggregationMethodSerialized<PHHashMap<doris::StringRef, char*, DefaultHash<doris::StringRef, void>, false> >&) const::{lambda(auto:1 const&, auto:2 const&)#1}&) (data=..., hash_value=94595109437455, row=0, 
    pool=..., this=<optimized out>, f=...) at /root/be/src/vec/common/columns_hashing_impl.h:155
#13 doris::vectorized::AggregationNode::_emplace_into_hash_table(char**, std::vector<doris::vectorized::IColumn const*, std::allocator<doris::vectorized::IColumn const*> >&, unsigned long)::$_0::operator()<doris::vectorized::AggregationMethodSerialized<PHHashMap<doris::StringRef, char*, DefaultHash<doris::StringRef, void>, false> >&>(doris::vectorized::AggregationMethodSerialized<PHHashMap<doris::StringRef, char*, DefaultHash<doris::StringRef, void>, false> >&) const (this=0x7f5ff0c940e0, agg_method=...) at /root/be/src/vec/exec/vaggregation_node.cpp:997
#14 std::__invoke_impl<void, doris::vectorized::AggregationNode::_emplace_into_hash_table(char**, std::vector<doris::vectorized::IColumn const*, std::allocator<doris::vectorized::IColumn const*> >&, unsigned long)::$_0, doris::vectorized::AggregationMethodSerialized<PHHashMap<doris::StringRef, char*, DefaultHash<doris::StringRef, void>, false> >&>(std::__invoke_other, doris::vectorized::AggregationNode::_emplace_into_hash_table(char**, std::vector<doris::vectorized::IColumn const*, std::allocator<doris::vectorized::IColumn const*> >&, unsigned long)::$_0&&, doris::vectorized::AggregationMethodSerialized<PHHashMap<doris::StringRef, char*, DefaultHash<doris::StringRef, void>, false> >&) (__f=..., __args=...)
    at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:61
#15 std::__invoke<doris::vectorized::AggregationNode::_emplace_into_hash_table(char**, std::vector<doris::vectorized::IColumn const*, std::allocator<doris::vectorized::IColumn const*> >&, unsigned long)::$_0, doris::vectorized::AggregationMethodSerialized<PHHashMap<doris::StringRef, char*, DefaultHash<doris::StringRef, void>, false> >&>(doris::vectorized::AggregationNode::_emplace_into_hash_table(char**, std::vector<doris::vectorized::IColumn const*, std::allocator<doris::vectorized::IColumn const*> >&, unsigned long)::$_0&&, doris::vectorized::AggregationMethodSerialized<PHHashMap<doris::StringRef, char*, DefaultHash<doris::StringRef, void>, false> >&) (__fn=..., __args=...)
    at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:96
#16 _ZNSt8__detail9__variant17__gen_vtable_implINS0_12_Multi_arrayIPFNS0_21__deduce_visit_resultIvEEOZN5doris10vectorized15AggregationNode24_emplace_into_hash_tableEPPcRSt6vectorIPKNS6_7IColumnESaISD_EEmE3$_0RSt7variantIJNS6_27AggregationMethodSerializedI9PHHashMapINS5_9StringRefES8_11DefaultHashISM_vELb0EEEENS6_26AggregationMethodOneNumberIh12FixedHashMapIhS8_28FixedHashMapImplicitZeroCellIhS8_16HashTableNoStateE28FixedHashTableCalculatedSizeISV_E9AllocatorILb1ELb1ELb0EEELb0EEENSR_ItSS_ItS8_ST_ItS8_SU_E24FixedHashTableStoredSizeIS12_ESZ_ELb0EEENSR_IjSL_IjS8_9HashCRC32IjELb0EELb0EEENSR_ImSL_ImS8_S17_ImELb0EELb0EEENS6_30AggregationMethodStringNoCacheI13StringHashMapIS8_SZ_EEENSR_INS6_7UInt128ESL_IS1I_S8_S17_IS1I_ELb0EELb0EEENSR_IjSL_IjS8_14HashMixWrapperIjS18_ELb0EELb0EEENSR_ImSL_ImS8_S1M_ImS1B_ELb0EELb0EEENSR_IS1I_SL_IS1I_S8_S1M_IS1I_S1J_ELb0EELb0EEENS6_37AggregationMethodSingleNullableColumnINSR_IhNS6_26AggregationDataWithNullKeyIS10_EELb0EEEEENS1W_INSR_ItNS1X_IS15_EELb0EEEEENS1W_INSR_IjNS1X_IS19_EELb0EEEEENS1W_INSR_ImNS1X_IS1C_EELb0EEEEENS1W_INSR_IjNS1X_IS1O_EELb0EEEEENS1W_INSR_ImNS1X_IS1R_EELb0EEEEENS1W_INSR_IS1I_NS1X_IS1K_EELb0EEEEENS1W_INSR_IS1I_NS1X_IS1U_EELb0EEEEENS1W_INS1E_INS1X_IS1G_EEEEEENS6_26AggregationMethodKeysFixedIS1C_Lb0EEENS2P_IS1C_Lb1EEENS2P_IS1K_Lb0EEENS2P_IS1K_Lb1EEENS2P_ISL_INS6_7UInt256ES8_S17_IS2U_ELb0EELb0EEENS2P_IS2W_Lb1EEENS2P_IS1R_Lb0EEENS2P_IS1R_Lb1EEENS2P_IS1U_Lb0EEENS2P_IS1U_Lb1EEENS2P_ISL_IS2U_S8_S1M_IS2U_S2V_ELb0EELb0EEENS2P_IS34_Lb1EEEEEEJEEESt16integer_sequenceImJLm0EEEE14__visit_invokeESI_S38_ (__visitor=..., __vars=...)
    at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/variant:1013
#17 0x000056095969b7d7 in _ZSt10__do_visitINSt8__detail9__variant21__deduce_visit_resultIvEEZN5doris10vectorized15AggregationNode24_emplace_into_hash_tableEPPcRSt6vectorIPKNS5_7IColumnESaISC_EEmE3$_0JRSt7variantIJNS5_27AggregationMethodSerializedI9PHHashMapINS4_9StringRefES7_11DefaultHashISK_vELb0EEEENS5_26AggregationMethodOneNumberIh12FixedHashMapIhS7_28FixedHashMapImplicitZeroCellIhS7_16HashTableNoStateE28FixedHashTableCalculatedSizeIST_E9AllocatorILb1ELb1ELb0EEELb0EEENSP_ItSQ_ItS7_SR_ItS7_SS_E24FixedHashTableStoredSizeIS10_ESX_ELb0EEENSP_IjSJ_IjS7_9HashCRC32IjELb0EELb0EEENSP_ImSJ_ImS7_S15_ImELb0EELb0EEENS5_30AggregationMethodStringNoCacheI13StringHashMapIS7_SX_EEENSP_INS5_7UInt128ESJ_IS1G_S7_S15_IS1G_ELb0EELb0EEENSP_IjSJ_IjS7_14HashMixWrapperIjS16_ELb0EELb0EEENSP_ImSJ_ImS7_S1K_ImS19_ELb0EELb0EEENSP_IS1G_SJ_IS1G_S7_S1K_IS1G_S1H_ELb0EELb0EEENS5_37AggregationMethodSingleNullableColumnINSP_IhNS5_26AggregationDataWithNullKeyISY_EELb0EEEEENS1U_INSP_ItNS1V_IS13_EELb0EEEEENS1U_INSP_IjNS1V_IS17_EELb0EEEEENS1U_INSP_ImNS1V_IS1A_EELb0EEEEENS1U_INSP_IjNS1V_IS1M_EELb0EEEEENS1U_INSP_ImNS1V_IS1P_EELb0EEEEENS1U_INSP_IS1G_NS1V_IS1I_EELb0EEEEENS1U_INSP_IS1G_NS1V_IS1S_EELb0EEEEENS1U_INS1C_INS1V_IS1E_EEEEEENS5_26AggregationMethodKeysFixedIS1A_Lb0EEENS2N_IS1A_Lb1EEENS2N_IS1I_Lb0EEENS2N_IS1I_Lb1EEENS2N_ISJ_INS5_7UInt256ES7_S15_IS2S_ELb0EELb0EEENS2N_IS2U_Lb1EEENS2N_IS1P_Lb0EEENS2N_IS1P_Lb1EEENS2N_IS1S_Lb0EEENS2N_IS1S_Lb1EEENS2N_ISJ_IS2S_S7_S1K_IS2S_S2T_ELb0EELb0EEENS2N_IS32_Lb1EEEEEEEDcOT0_DpOT1_ (__visitor=..., __variants=...)
    at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/variant:1714
#18 _ZSt5visitIZN5doris10vectorized15AggregationNode24_emplace_into_hash_tableEPPcRSt6vectorIPKNS1_7IColumnESaIS8_EEmE3$_0JRSt7variantIJNS1_27AggregationMethodSerializedI9PHHashMapINS0_9StringRefES3_11DefaultHashISG_vELb0EEEENS1_26AggregationMethodOneNumberIh12FixedHashMapIhS3_28FixedHashMapImplicitZeroCellIhS3_16HashTableNoStateE28FixedHashTableCalculatedSizeISP_E9AllocatorILb1ELb1ELb0EEELb0EEENSL_ItSM_ItS3_SN_ItS3_SO_E24FixedHashTableStoredSizeISW_EST_ELb0EEENSL_IjSF_IjS3_9HashCRC32IjELb0EELb0EEENSL_ImSF_ImS3_S11_ImELb0EELb0EEENS1_30AggregationMethodStringNoCacheI13StringHashMapIS3_ST_EEENSL_INS1_7UInt128ESF_IS1C_S3_S11_IS1C_ELb0EELb0EEENSL_IjSF_IjS3_14HashMixWrapperIjS12_ELb0EELb0EEENSL_ImSF_ImS3_S1G_ImS15_ELb0EELb0EEENSL_IS1C_SF_IS1C_S3_S1G_IS1C_S1D_ELb0EELb0EEENS1_37AggregationMethodSingleNullableColumnINSL_IhNS1_26AggregationDataWithNullKeyISU_EELb0EEEEENS1Q_INSL_ItNS1R_ISZ_EELb0EEEEENS1Q_INSL_IjNS1R_IS13_EELb0EEEEENS1Q_INSL_ImNS1R_IS16_EELb0EEEEENS1Q_INSL_IjNS1R_IS1I_EELb0EEEEENS1Q_INSL_ImNS1R_IS1L_EELb0EEEEENS1Q_INSL_IS1C_NS1R_IS1E_EELb0EEEEENS1Q_INSL_IS1C_NS1R_IS1O_EELb0EEEEENS1Q_INS18_INS1R_IS1A_EEEEEENS1_26AggregationMethodKeysFixedIS16_Lb0EEENS2J_IS16_Lb1EEENS2J_IS1E_Lb0EEENS2J_IS1E_Lb1EEENS2J_ISF_INS1_7UInt256ES3_S11_IS2O_ELb0EELb0EEENS2J_IS2Q_Lb1EEENS2J_IS1L_Lb0EEENS2J_IS1L_Lb1EEENS2J_IS1O_Lb0EEENS2J_IS1O_Lb1EEENS2J_ISF_IS2O_S3_S1G_IS2O_S2P_ELb0EELb0EEENS2J_IS2Y_Lb1EEEEEEEDcOT_DpOT0_ (__visitor=..., __variants=...)
    at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/variant:1764
#19 doris::vectorized::AggregationNode::_emplace_into_hash_table (this=0x560ae7fbd200, places=<optimized out>, key_columns=..., num_rows=1)
    at /root/be/src/vec/exec/vaggregation_node.cpp:924
#20 doris::vectorized::AggregationNode::_merge_with_serialized_key_helper<false, false> (this=0x560ae7fbd200, block=0x560bec646cf0)
    at /root/be/src/vec/exec/vaggregation_node.h:1105
#21 0x000056095962ea04 in doris::vectorized::AggregationNode::_merge_with_serialized_key (this=0x80, block=0x5608a3e5400f)
    at /root/be/src/vec/exec/vaggregation_node.cpp:1662
#22 0x00005609596daefb in std::__invoke_impl<doris::Status, doris::Status (doris::vectorized::AggregationNode::*&)(doris::vectorized::Block*), doris::vectorized::AggregationNode*&, doris::vectorized::Block*> (__f=<optimized out>, __t=<optimized out>, __args=<optimized out>)
    at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:74
#23 std::__invoke_r<doris::Status, doris::Status (doris::vectorized::AggregationNode::*&)(doris::vectorized::Block*), doris::vectorized::AggregationNode*&, doris::vectorized::Block*> (__fn=<optimized out>, __args=<optimized out>, __args=<optimized out>)
    at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:114
#24 std::_Bind_result<doris::Status, doris::Status (doris::vectorized::AggregationNode::*(doris::vectorized::AggregationNode*, std::_Placeholder<1>))(doris::vectorized::Block*)>::__call<doris::Status, doris::vectorized::Block*&&, 0ul, 1ul>(std::tuple<doris::vectorized::Block*&&>&&, std::_Index_tuple<0ul, 1ul>) (this=<optimized out>, __args=...)
    at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/functional:570
#25 std::_Bind_result<doris::Status, doris::Status (doris::vectorized::AggregationNode::*(doris::vectorized::AggregationNode*, std::_Placeholder<1>))(doris::vectorized::Block*)>::operator()<doris::vectorized::Block*>(doris::vectorized::Block*&&) (this=<optimized out>, 
    __args=<optimized out>) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/functional:629
#26 std::__invoke_impl<doris::Status, std::_Bind_result<doris::Status, doris::Status (doris::vectorized::AggregationNode::*(doris::vectorized::AggregationNode*, std::_Placeholder<1>))(doris::vectorized::Block*)>&, doris::vectorized::Block*>(std::__invoke_other, std::_Bind_result<doris::Status, doris::Status (doris::vectorized::AggregationNode::*(doris::vectorized::AggregationNode*, std::_Placeholder<1>))(doris::vectorized::Block*)>&, doris::vectorized::Block*&&) (__f=..., __args=<optimized out>)
    at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:61
#27 std::__invoke_r<doris::Status, std::_Bind_result<doris::Status, doris::Status (doris::vectorized::AggregationNode::*(doris::vectorized::AggregationNode*, std::_Placeholder<1>))(doris::vectorized::Block*)>&, doris::vectorized::Block*>(std::_Bind_result<doris::Status, doris::Status (doris::vectorized::AggregationNode::*(doris::vectorized::AggregationNode*, std::_Placeholder<1>))(doris::vectorized::Block*)>&, doris::vectorized::Block*&&) (__fn=..., __args=<optimized out>)
    at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:114
#28 std::_Function_handler<doris::Status (doris::vectorized::Block*), std::_Bind_result<doris::Status, doris::Status (doris::vectorized::AggregationNode::*(doris::vectorized::AggregationNode*, std::_Placeholder<1>))(doris::vectorized::Block*)> >::_M_invoke(std::_Any_data const&, doris::vectorized::Block*&&) (__functor=..., __args=<optimized out>)
    at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:291
#29 0x00005609596309f9 in std::function<doris::Status (doris::vectorized::Block*)>::operator()(doris::vectorized::Block*) const (this=0x80, 
    __args=0x560bec646cf0) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:560
#30 doris::vectorized::AggregationNode::sink (this=0x560ae7fbd200, state=<optimized out>, in_block=0x560bec646cf0, eos=false)
    at /root/be/src/vec/exec/vaggregation_node.cpp:602
#31 0x000056095cf2dac0 in doris::pipeline::StreamingOperator<doris::pipeline::AggSinkOperatorBuilder>::sink (this=<optimized out>, 
    state=0x5608a3e5400f, in_block=0xc6a4a7935bd1e995, source_state=<optimized out>) at /root/be/src/pipeline/exec/operator.h:336
#32 0x000056095cf5200c in doris::pipeline::PipelineTask::execute (this=0x560b73e00580, eos=0x7f5ff0c944d7)
    at /root/be/src/pipeline/pipeline_task.cpp:282
#33 0x000056095cf5acea in doris::pipeline::TaskScheduler::_do_work (this=0x560966cd8770, index=11)
    at /root/be/src/pipeline/task_scheduler.cpp:268
#34 0x000056095578f33f in doris::ThreadPool::dispatch_thread (this=0x56096a246700) at /root/be/src/util/threadpool.cpp:533
#35 0x00005609557852bc in std::function<void ()>::operator()() const (this=0x0)
    at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:560
#36 doris::Thread::supervise_thread (arg=0x56096ac717a0) at /root/be/src/util/thread.cpp:498
#37 0x00007f6081c5eea5 in start_thread () from /lib64/libpthread.so.0
#38 0x00007f608268d9fd in clone () from /lib64/libc.so.6
  2. Problem
    When handling aggregate queries, the BE server may crash. The stack trace shows that the crash occurs in doris::CRC32Hash::operator() while it computes the hash value of a StringRef object. The issue is more likely to be triggered when the size of the StringRef is not a multiple of 8 (e.g., 15 bytes).

  3. Cause analysis
    Through GDB debugging, we found:
    3-1. The crash occurs during the hash table resize, while the hash value of a StringRef is being computed.
    3-2. The size of the StringRef is 15 bytes, and its data pointer is valid. The statement involved is:
    doris::vectorized::UInt64 word = unaligned_load<doris::vectorized::UInt64>(end - 8);
    When the size of the StringRef is 15, the code reads 8 bytes starting from data+7. Although these bytes exist in memory, such unaligned memory access may cause a crash on some architectures.

  4. GDB

0x5609af33c000: 0xea    0xa1    0xae    0x0d    0x00    0x00    0x00    0x00
0x5609af33c008: 0x03    0x00    0x00    0x00    0x55    0x54    0x41    0x00

The 8 bytes starting from data+7 span both of the 8-byte memory lines shown above, which may cause unaligned access issues.
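To make the tail read described above concrete, here is a minimal sketch of a memcpy-based unaligned load and of the offset at which the last 8-byte word of a key begins. The names unaligned_load_sketch and tail_word_offset are hypothetical and are not taken from the Doris source; the real unaligned_load and CRC32Hash in string_ref.h may be implemented differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for an unaligned load helper: the bytes are copied
// with memcpy, so the read does not assume that `address` is aligned.
template <typename T>
T unaligned_load_sketch(const void* address) {
    T value;
    std::memcpy(&value, address, sizeof(T));
    return value;
}

// Offset (relative to the start of the key) of the final 8-byte word that
// a read from `end - 8` covers. Only meaningful for keys of 8+ bytes.
inline std::size_t tail_word_offset(std::size_t size) {
    assert(size >= 8);
    return size - 8;
}

// True when an 8-byte read at `address` is not 8-byte aligned and therefore
// covers parts of two adjacent aligned 8-byte words.
inline bool straddles_aligned_words(std::uintptr_t address) {
    return address % 8 != 0;
}
```

For a 15-byte key, tail_word_offset(15) is 7, so the final word covers bytes 7 through 14 and overlaps the first word by one byte. If data is 8-byte aligned, data+7 is not, which is the situation this description refers to.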

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Mar 3, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@felixwluo
Contributor Author

run buildall

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 0.00% (0/16) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 38.55% (8310/21555)
Line Coverage 30.25% (68738/227216)
Region Coverage 29.67% (35393/119279)
Branch Coverage 25.43% (18193/71532)

@felixwluo felixwluo marked this pull request as draft March 4, 2025 04:33
@felixwluo felixwluo closed this Mar 4, 2025