Alphonse Raj, Ranjith Gnana Suthakar https://orcid.org/0000-0001-5611-5219
Sandesh, B. J.
Article History
Received: 16 June 2024
Accepted: 31 May 2025
First Online: 24 June 2025
Declarations
Competing interests: The authors declare that they have no competing interests, financial or otherwise, that could be perceived to influence the work reported in this manuscript. No funds, grants, or other support were received during the preparation of this manuscript. The authors have no relevant financial or non-financial interests to disclose.
Ethics approval: This study did not involve human participants or animals; ethical approval and informed consent were therefore not required.
Code availability: To meet transparency requirements, we have made the source code and dataset publicly accessible. Researchers can access the preprocessing scripts, training procedures, and evaluation metrics via our GitHub repository and the COCO dataset portal. Detailed descriptions and usage instructions are provided so that our methods can be replicated and built upon, promoting further advances in IC.
Data availability: The MS COCO 2017 dataset provides a robust foundation for training and evaluating our IC model. Its diverse and extensive collection of annotated images ensures comprehensive coverage of varied visual contexts and objects, which is essential for developing a model capable of generating high-quality captions across different scenarios. Our preprocessing pipeline not only enhances the quality of the training data but also sets a benchmark for future research in the field. The source code and detailed descriptions of the preprocessing steps are publicly available in our GitHub () repository, and the dataset can be accessed through the COCO () website. These resources ensure transparency and reproducibility, facilitating further advances in IC research.