
License and use

Open Access

Grant support

The work leading to these results was supported by the ASTOUND project (101071191 HORIZON-EIC-2021-PATHFINDERCHALLENGES-01), funded by the European Commission. It was also supported by the Spanish Ministry of Science and Innovation through the projects AMIC-PoC, BeWord, and GOMINOLA (PDC2021-120846-C42, PID2021-126061OB-C43, PID2020-118112RB-C21/AEI/10.13039/501100011033 and PID2020-118112RB-C22/AEI/10.13039/501100011033, funded by MCIN/AEI/10.13039/501100011033 and by the European Union "NextGenerationEU/PRTR"). We also thank MS Azure OpenAI access (especially Irving Kwong) for sponsoring the generation and evaluation of the dialogues with OpenAI models.

DAS: The custom code used for creating the dataset is available online: https://github.com/eic-astound-ai-project/artGenEvalPlatform.

August 11, 2024

A dataset of synthetic art dialogues with ChatGPT

Published in: Scientific Data. 11 (1): 825 - 2024-07-27, DOI: 10.1038/s41597-024-03661-x

Authors: Gil-Martin, Manuel; Luna-Jimenez, Cristina; Esteban-Romero, Sergio; Estecha-Garitagoitia, Marcos; Fernandez-Martinez, Fernando; D'Haro, Luis Fernando

Affiliations

Univ Politecn Madrid, Informat Proc & Telecommun Ctr, Speech Technol & Machine Learning Grp THAU Grp, ETSI Telecomunicac, Madrid 28040, Spain - Author

Abstract

This paper introduces Art_GenEvalGPT, a novel dataset of synthetic dialogues centered on art generated through ChatGPT. Unlike existing datasets focused on conventional art-related tasks, Art_GenEvalGPT delves into nuanced conversations about art, encompassing a wide variety of artworks, artists, and genres, and incorporating emotional interventions, integrating speakers' subjective opinions and different roles for the conversational agents (e.g., teacher-student, expert guide, anthropic behavior or handling toxic users). Generation and evaluation stages of GenEvalGPT platform are used to create the dataset, which includes 13,870 synthetic dialogues, covering 799 distinct artworks, 378 different artists, and 26 art styles. Automatic and manual assessment prove the high quality of the synthetic dialogues generated. For the profile recovery, promising lexical and semantic metrics for objective and factual attributes are offered. For subjective attributes, the evaluation for detecting emotions or subjectivity in the interventions achieves 92% accuracy using LLM-self assessment metrics.

Keywords

Quality education

Quality index

Bibliometric impact. Analysis of the contribution and dissemination channel

The work has been published in the journal Scientific Data. Given its progression and the good impact it has achieved in recent years according to the agency WoS (JCR), the journal has become a reference in its field. For the year of publication, 2024, no indicators have been calculated yet, but in 2023 it ranked 15/135 in the category Multidisciplinary Sciences, which places it in Q1 (first quartile).
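As a rough illustration of how a rank such as 15/135 maps to a quartile, the sketch below assumes the usual convention that a journal is in Q1 when its rank divided by the number of journals in the category is at most 0.25, Q2 when at most 0.50, and so on. The function name `jcr_quartile` is hypothetical and is not part of any WoS or JCR tool.

```python
def jcr_quartile(rank: int, total: int) -> str:
    """Return the quartile label (Q1..Q4) for a journal ranked `rank` of `total`.

    Assumed convention: quartile k covers (k-1)/4 < rank/total <= k/4.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    # Integer ceiling of 4 * rank / total, avoiding floating-point rounding.
    quartile = (4 * rank + total - 1) // total
    return f"Q{quartile}"


# Position reported above for Scientific Data in 2023.
print(jcr_quartile(15, 135))
```

Under this assumption, 15/135 is about 0.11, well inside the first quartile; the Q1 boundary for a category of 135 journals would fall at rank 33.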

Regardless of the expected impact determined by the dissemination channel, it is important to highlight the actual observed impact of the contribution itself.

According to the different indexing agencies, the number of citations accumulated by this publication as of 2025-10-24 is:

  • WoS: 1
  • Scopus: 1

Impact and social visibility

From the perspective of influence or social adoption, and based on metrics associated with mentions and interactions provided by agencies specializing in calculating the so-called "Alternative or Social Metrics," we can highlight as of 2025-10-24:

  • Academic use, as evidenced by the Altmetric indicator counting additions to the personal bibliographic manager Mendeley: 16.
  • Use of this contribution in bookmarks, code forks, additions to favorite lists for recurrent reading, and general views, which suggests that readers are using the publication as a basis for their current work and may be an early indicator of future, more formal academic citations. The "Capture" indicator yields a total of: 19 (PlumX).

With a more dissemination-oriented intent and targeting more general audiences, we can observe other more global scores such as:

  • The Total Score from Altmetric: 0.5.
  • The number of mentions on the social network X (formerly Twitter): 1 (Altmetric).

It is essential to present evidence supporting full alignment with institutional principles and guidelines on Open Science and the Conservation and Dissemination of Intellectual Heritage. A clear example of this is:

  • The work has been submitted to a journal whose editorial policy allows Open Access publication.
  • Assignment of a Handle/URN as an identifier within the deposit in the Institutional Repository: https://oa.upm.es/82985/

As a result of the publication of the work in the institutional repository, statistical usage data has been obtained that reflects its impact. In terms of dissemination, we can state that, as of

  • Views: 138
  • Downloads: 33
Continuing with the social impact of the work, its content can be assigned to the area of interest of SDG 4 - Quality Education, with a probability of 82% according to the mBERT algorithm developed by Aurora University.

Leadership analysis of institutional authors

There is a significant leadership presence, as authors from the institution appear as both first and last author: First Author (GIL MARTIN, MANUEL) and Last Author (D'HARO ENRIQUEZ, LUIS FERNANDO).

The corresponding author is GIL MARTIN, MANUEL.