Security Knowledge Distillation for Enhancing LLM Safety in Code Generation

Keywords

knowledge distillation
secure coding
vulnerability categories
LLM compression
CWE mitigation

Abstract

We propose a distillation framework that transfers secure coding expertise into a compact model trained on expert demonstrations, static-analysis labels, and secure coding patterns. The distilled model (7B parameters) achieves 83% security compliance, close to GPT-4's 89%, while requiring significantly less compute. Evaluated on CWE-20 (improper input validation), CWE-78 (OS command injection), and CWE-89 (SQL injection), the distilled model reduces injection vulnerabilities by 62% relative to baseline models. These results indicate that targeted distillation can teach smaller models to follow security best practices.
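
As an illustration of the kind of secure pattern the abstract refers to, the following minimal Python sketch contrasts a query built by string concatenation (the CWE-89 weakness) with a parameterized query. It is not taken from the paper's training data or evaluation suite; the table, the function names `find_user_unsafe` and `find_user_safe`, and the payload are all hypothetical and chosen only for demonstration.

```python
import sqlite3


def make_db():
    """Create a small in-memory database (hypothetical demo data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "admin"), ("bob", "user")],
    )
    return conn


def find_user_unsafe(conn, name):
    # Vulnerable pattern (CWE-89): user input is spliced into the SQL text,
    # so quotes in `name` can change the structure of the query.
    query = "SELECT name, role FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Secure pattern: the value is bound as a parameter, so it is always
    # treated as data and never parsed as SQL.
    return conn.execute(
        "SELECT name, role FROM users WHERE name = ?", (name,)
    ).fetchall()


conn = make_db()
payload = "x' OR '1'='1"  # classic injection string (illustrative)

leaked = find_user_unsafe(conn, payload)  # condition becomes always true
blocked = find_user_safe(conn, payload)   # no user has this literal name
```

With the injection string, the concatenated query matches every row in the table, while the parameterized query matches none. A security-compliance check of the kind described above would presumably reward the second form, although the paper's exact criteria are not stated here.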

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Copyright (c) 2026 Daniel Thompson, Sarah Mitchell, Jason Li (Author)