KatzBot is a chatbot designed to enhance communication within university communities. It is built on the Katz generative pre-trained transformer (KatzGPT), a custom large language model (LLM) that addresses the accuracy gaps often observed in existing academic chatbot systems. KatzGPT is trained on two curated datasets, one of sentence-completion pairs and one of question-answer pairs, to expand its knowledge base and improve its accuracy. The system offers a concise and effective interface for communication between users and the KatzGPT model, improving on the typical academic chatbot experience.
| Data Type | Description | Count |
|---|---|---|
| Sentence Completion | Training for knowledge integration | 6,280 |
| Train QA Pairs | Enhancing detailed understanding | 7,334 |
| Test QA Pairs | Assessing model consistency | 2,081 |
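To make the two record types concrete, the training data might be stored as follows. This is a hypothetical sketch: the paper does not specify the schema, and the field names (`prompt`, `completion`, `question`, `answer`) and the example text are assumptions for illustration only.

```python
# Hypothetical record formats for the two KatzGPT training datasets.
# Field names and contents are illustrative assumptions, not the paper's schema.

# Sentence-completion pair: the model learns to continue a partial sentence.
sentence_completion_example = {
    "prompt": "The university's graduate admissions deadline for the fall term is",
    "completion": "posted on the admissions page and updated each academic year.",
}

# Question-answer pair: the model learns to answer a user question directly.
qa_pair_example = {
    "question": "How do I contact the admissions office?",
    "answer": "You can reach the admissions office by email or phone during business hours.",
}

def is_valid_record(record: dict, required_keys: tuple) -> bool:
    """Check that a record has exactly the required non-empty string fields."""
    return (
        set(record) == set(required_keys)
        and all(isinstance(record[k], str) and record[k] for k in required_keys)
    )
```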
In the comparative analysis detailed in the table below, we assess the performance of several large language models (LLMs), including our in-house KatzGPT model. The evaluation benchmarks Rouge scores, focusing particularly on Rouge-L, which measures the long-form coherence of generated text.
| Model | Rouge-1 | Rouge-2 | Rouge-L |
|---|---|---|---|
| Llama2 3B | 0.23 | 0.07 | 0.20 |
| Microsoft Phi 1.5 | 0.26 | 0.10 | 0.24 |
| Llama2 7B | 0.28 | 0.12 | 0.25 |
| KatzGPT | 0.29 | 0.16 | 0.25 |
| Microsoft Phi2 | 0.34 | 0.15 | 0.31 |
| Mistral 7B Instruct | 0.43 | 0.20 | 0.33 |
| GPT-2 | 0.45 | 0.32 | 0.43 |
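Rouge-L scores a candidate against a reference by the length of their longest common subsequence (LCS) of tokens. The following is a minimal sketch of that computation, not the paper's actual evaluation code, which may use a standard library such as `rouge-score` with stemming and other normalization:

```python
def lcs_length(a: list, b: list) -> int:
    """Length of the longest common subsequence of two token lists (classic DP)."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]

def rouge_l_f1(reference: str, candidate: str) -> float:
    """Rouge-L F1: harmonic mean of LCS-based precision and recall.

    Uses naive whitespace tokenization; real implementations typically
    apply stemming and case normalization as well.
    """
    ref, cand = reference.split(), candidate.split()
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)
```

For example, "the cat sat on the mat" vs. "the cat lay on the mat" share the 5-token subsequence "the cat on the mat" out of 6 tokens each, giving an F1 of 5/6.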
KatzGPT, which builds on the foundational GPT architecture, performs notably well on Rouge-L, matching Llama2 7B (0.25) and indicating a robust ability to reproduce the context and structure of source texts. Despite being a newcomer, KatzGPT surpasses several predecessors, including Microsoft Phi 1.5 and both Llama2 models, underscoring its grasp of nuanced textual relationships.

Notably, GPT-2, the base for KatzGPT's development, is the leading model in this evaluation; its top scores across all Rouge metrics affirm its effectiveness as a foundation and make it a pivotal benchmark in this comparison. Mistral 7B Instruct leads among the remaining models in extractive and instructive capabilities but still trails GPT-2, highlighting how these models compete on different aspects of language comprehension and generation. Llama2 3B underperforms across all metrics, suggesting it requires further refinement, particularly in understanding and response generation.

Overall, the evaluation highlights KatzGPT's ability to preserve coherence and context in its outputs and establishes it as a credible rival to more established models. The comparative study illuminates the varying capabilities of, and potential areas for improvement among, current LLMs, positioning KatzGPT as an impactful newcomer in text generation and understanding. With additional training and adjustments to its architecture, KatzGPT has the potential to achieve even greater performance.