Parameter-efficient fine-tuning of Llama-2 using quantized low-rank adaptation

Date: 2024-11
Publisher: BRAC University
Author: Islam, Md. Tariqul

Abstract
Llama-2 is an advanced large language model with strong potential in text generation, sentiment
analysis, and language understanding. This report focuses on the process of fine-tuning
it to build a chatbot on custom datasets, covering dataset specification, hyperparameters,
and training strategies. Experimental results on the Guanaco dataset demonstrate the model's
strong adaptability: the fine-tuned model outperforms the baseline in human evaluation
and achieves significant BERTScore results for helpfulness and safety. The analysis includes
an in-depth examination of Llama-2's architecture, outlining its strengths and
suggesting areas for improvement. We also investigate the transformative potential of
parameter-efficient fine-tuning of Llama-2 through quantized low-rank adaptation.
The objective is to strike a balance between model complexity and efficiency,
addressing challenges in resource-constrained environments.
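
As a concrete illustration of the technique the abstract describes, the sketch below shows what quantized low-rank adaptation (QLoRA) of Llama-2 typically looks like in Python with the Hugging Face transformers, peft, and bitsandbytes libraries. It is a minimal sketch, not the report's exact configuration: the checkpoint name, adapter rank, and target modules are illustrative assumptions.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "meta-llama/Llama-2-7b-hf"  # assumed base checkpoint

# Quantize the frozen base weights to 4-bit NF4 to fit constrained hardware.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config)
model = prepare_model_for_kbit_training(model)

# Attach low-rank adapters; these are the only trainable parameters.
lora_config = LoraConfig(
    r=16,                                  # adapter rank (illustrative choice)
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # assumed attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total weights

Because only the small adapter matrices receive gradients while the 4-bit base model stays frozen, this setup trades a modest amount of model flexibility for a large reduction in memory, which is the complexity-versus-efficiency balance the report targets.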