Large Language Models in Intellectual Discourse: An Empirical Evaluation of Performance
DOI:
https://doi.org/10.26389/AJSRP.N050525

Keywords:
Logical Reasoning, Context Retention, Intellectual Dialogues, Large Language Models, Model Performance Evaluation

Abstract
Large language models (LLMs) have made a qualitative leap, enabling them to generate long, coherent texts with advanced contextual understanding and reasoning. Nevertheless, their proficiency in managing deep intellectual dialogues remains uneven. This study compares the performance of 24 models, both closed- and open-source (each sub-release is treated as a separate model). The closed models include GPT-4, Gemini 2, and Fanar, while the open models feature DeepSeek R1, Llama, Gemma, Mistral, and PHI-4.
The evaluation draws on more than 500,000 exchanges (comments, replies, quotations) across about 30,000 posts on the Fikran platform, where the models produced ≈ 99% of the content.
Assessment relied on four main criteria: (1) the quality of philosophical and logical reasoning, (2) coherence of ideas throughout long conversations, (3) accuracy of Arabic usage, and (4) speed of context loss and information repetition. Results show that closed models excel in logical analysis but tend to avoid controversial topics and suffer from customization and accessibility constraints. Fanar delivers Arabic linguistic accuracy comparable to larger models yet displays relative weakness in sustaining context over extended dialogues. Open models achieved competitive performance after fine-tuning; compressed variants offered faster responses at the expense of coherence, whereas larger models provided deeper analysis with longer latency. The study underscores the need for strategies (such as interactive knowledge retrieval) that reduce context loss and shorten response time in open models, enabling them to handle extended intellectual dialogues and compete with closed models in the future.
Closed models scored higher in reasoning quality (averaging over 85%), while open models ranged between approximately 60% and 70%.
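The group-level comparison reported above can be expressed as a simple per-group aggregation of per-model scores. A minimal sketch follows; the individual model scores are illustrative placeholders chosen to be consistent with the reported group ranges, not figures from the study:

```python
# Sketch of the group-level aggregation described in the abstract.
# All per-model scores below are hypothetical, chosen only to fall
# within the reported ranges (closed > 85%, open roughly 60-70%).
from statistics import mean

reasoning_quality = {
    "closed": {"GPT-4": 88, "Gemini 2": 87, "Fanar": 86},
    "open": {"DeepSeek R1": 70, "Llama": 65, "Gemma": 63,
             "Mistral": 62, "PHI-4": 68},
}

# Average reasoning-quality score per group
group_means = {group: mean(scores.values())
               for group, scores in reasoning_quality.items()}
print(group_means)
```

The same aggregation pattern would apply to the other three criteria (coherence, Arabic accuracy, context retention), yielding one mean per group per criterion.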
License
Copyright (c) 2025 Arab Institute of Sciences & Research Publishing - AISRP

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.