Large Scale Fine-Tuned Transformers Models Application for Business Names Generation
DOI:
https://doi.org/10.31577/cai_2023_3_525

Keywords:
Natural language processing, NLP, natural language generation, NLG, transformers

Abstract
Natural language processing (NLP) is the computational analysis and processing of human language, using a variety of techniques that enable computer programs to work with natural language. NLP is increasingly applied to a wide range of real-world problems, from extracting meaningful information from unstructured data, analyzing sentiment, and translating text between languages to generating human-like text autonomously. The goal of this study is to employ transformer-based natural language models to generate high-quality business names. Specifically, this work investigates whether larger models, which require more training time, yield better results for generating relatively short texts such as business names. To this end, we fine-tune several transformer architectures, including both freely available and proprietary models, and compare their performance. Our dataset comprises 250 928 business names. Based on the perplexity metric, the top-performing model in our study is GPT2-Medium. However, our findings reveal a discrepancy between human evaluation and perplexity-based assessment: according to human evaluation, the best results are obtained with the GPT-Neo-1.3B model. Interestingly, the larger GPT-Neo-2.7B model yields poorer results, with its performance not statistically different from that of the GPT-Neo-125M model, which is 20 times smaller.
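To illustrate the kind of pipeline the abstract describes, the sketch below fine-tunes a causal transformer language model on a file of business names with the Hugging Face libraries, reports perplexity as the exponential of the evaluation cross-entropy loss, and samples new names. This is only a minimal, hedged illustration: the model identifier, file name, and hyperparameters are placeholders chosen for the example and are not taken from the paper.

```python
# Minimal sketch (assumed setup: Hugging Face transformers/datasets installed and a
# local "business_names.txt" with one name per line; not the authors' actual code).
import math
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)
from datasets import load_dataset

model_name = "EleutherAI/gpt-neo-125M"      # smallest of the model sizes compared
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token   # GPT-style tokenizers have no pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Load the training corpus: one business name per line (file name is a placeholder).
dataset = load_dataset("text", data_files={"train": "business_names.txt"})

def tokenize(batch):
    # Business names are short, so a small maximum length suffices.
    return tokenizer(batch["text"], truncation=True, max_length=32)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)  # causal LM labels

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="names-model", num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=tokenized["train"],
    data_collator=collator,
)
trainer.train()

# Perplexity = exp(mean cross-entropy loss); a held-out split would be used in practice.
eval_loss = trainer.evaluate(eval_dataset=tokenized["train"])["eval_loss"]
print("perplexity:", math.exp(eval_loss))

# Sample a few new business names from the fine-tuned model.
inputs = tokenizer(tokenizer.eos_token, return_tensors="pt")
outputs = model.generate(**inputs, do_sample=True, top_p=0.95,
                         max_new_tokens=12, num_return_sequences=5)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```

The same skeleton applies to the larger models mentioned in the abstract (e.g. GPT2-Medium, GPT-Neo-1.3B, GPT-Neo-2.7B) by changing `model_name`; only training time and memory requirements grow with model size.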