Enhancing Smart Contract Security: Static Heuristics and CodeBERT Embeddings

Conference paper


Soofiyan, S. and Karami, A. 2025. Enhancing Smart Contract Security: Static Heuristics and CodeBERT Embeddings. Applied Intelligence and Computing. The Institution of Electronics and Telecommunication Engineers (IETE), Delhi Centre, India 26 - 27 Jul 2025 IEEE.
Authors: Soofiyan, S. and Karami, A.
Type: Conference paper
Abstract

Smart contracts are foundational to decentralized applications, but because deployed contracts are immutable, security vulnerabilities cannot easily be patched and can lead to significant financial losses. Existing static analysis tools, such as Slither and Mythril, offer baseline detection but often lack accuracy and scalability for complex contracts. Emerging deep learning methods show promise but face challenges of their own, including oversimplified multi-class classification, difficulty processing long code sequences, and the constraint of assigning each contract to a single vulnerability category. To overcome these limitations, we propose a binary classification framework that determines whether a contract is secure or has at least one known vulnerability. The approach combines static heuristic features (e.g., control-flow complexity and external call frequency) with contextual semantic embeddings derived from CodeBERT. CodeBERT, a transformer-based model pre-trained on source code, provides rich semantic and syntactic representations that complement the static features. Evaluating five machine learning models on the SolidiFI and SmartBugs benchmark datasets, we demonstrate that this hybrid strategy significantly enhances detection performance. Notably, our Logistic XGBoost classifier achieves 100% accuracy, precision, and recall on SolidiFI, although we acknowledge that SolidiFI's relative simplicity may contribute to overly optimistic results and potential overfitting risks. On SmartBugs, ensemble models consistently achieve over 95% accuracy, indicating strong generalization across more diverse and complex contracts.
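
The hybrid feature idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the regular expressions, the feature names (`external_calls`, `branch_points`, `line_count`), and the helper functions are hypothetical, and the embedding is a short stand-in list rather than a real CodeBERT output. The sketch only shows how crude static counts and a precomputed embedding might be concatenated into one input vector for a binary classifier.

```python
import re
from typing import Dict, List

# Hypothetical heuristic patterns; the abstract does not list the actual features.
EXTERNAL_CALL = re.compile(r"\.(?:call|delegatecall|send|transfer)\b")
BRANCH = re.compile(r"\b(?:if|for|while|require)\b")


def static_features(source: str) -> Dict[str, float]:
    """Count rough proxies for external call frequency and control-flow complexity."""
    return {
        "external_calls": float(len(EXTERNAL_CALL.findall(source))),
        "branch_points": float(len(BRANCH.findall(source))),
        "line_count": float(len(source.splitlines())),
    }


def hybrid_vector(source: str, embedding: List[float]) -> List[float]:
    """Concatenate static features (in sorted key order) with a precomputed embedding."""
    feats = static_features(source)
    return [feats[key] for key in sorted(feats)] + list(embedding)


SAMPLE = """
contract Wallet {
    mapping(address => uint) balances;
    function withdraw(uint amount) public {
        require(balances[msg.sender] >= amount);
        msg.sender.call{value: amount}("");
        balances[msg.sender] -= amount;
    }
}
"""

# Stand-in for a CodeBERT embedding (assumption: a real one would come from the model).
fake_embedding = [0.1, -0.2, 0.3]
vector = hybrid_vector(SAMPLE, fake_embedding)
print(static_features(SAMPLE))
print(len(vector))
```

In a fuller pipeline, one such vector per contract, paired with a secure/vulnerable label, would be passed to whichever classifier is being evaluated; scaling the static counts before concatenation would likely matter for linear models.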

Year: 2025
Conference: Applied Intelligence and Computing
Publisher: IEEE
Accepted author manuscript
License: All rights reserved
File access level: Anyone
Publication process dates
Accepted: Jul 2025
Deposited: 09 Jul 2025
Journal citation: p. In press
Web address (URL) of conference proceedings: https://ieeexplore.ieee.org/xpl/conhome/1847284/all-proceedings
Web address (URL): https://scrs.in/conference/aic2025
Copyright holder: © 2025 IEEE
Copyright information: Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Permalink: https://repository.uel.ac.uk/item/8zx41

Download files


Accepted author manuscript
paper_2254-AAM.pdf
License: All rights reserved
File access level: Anyone

