ABSTRACT
Generative Adversarial Networks (GANs) are a category of neural network that has emerged in recent years and is trained in an unsupervised manner. Since their conception, they have been widely studied all over the world due to their vast potential in multiple fields: image synthesis, computer vision, data augmentation and natural language processing, to name a few. Despite the existing body of publications, this remains an active research topic, with new network architectures continually being proposed for various purposes. This study aimed to provide an in-depth analysis of the present state of generative networks for image generation. The analysis is based on a thorough examination of publications, authors, funding sponsors and affiliated institutions from reputable databases such as Scopus and Web of Science. Additionally, the study explored network characteristics such as co-authorship patterns, collaborations between countries, contributions, citations and keywords. This analysis revealed the substantial involvement of East Asian countries, notably China, in the realm of generative networks. The primary funding sponsors for this research predominantly hailed from East Asian countries, particularly China, South Korea and Japan, and the largest volume of documents and authors also originated from China.
INTRODUCTION
Today, machine learning has become a crucial part of our everyday lives, to the point that we use it without even realizing it. Everything from complex navigation apps with real-time data, such as Google Maps, to the most basic text auto-suggestion and correction on a smartphone keyboard relies on machine learning for increased precision and accuracy. The booming popularity of generative tools such as ChatGPT and Dall-E has drawn attention to generative models more broadly, among which GANs (Generative Adversarial Networks) are a prominent category of neural networks that generate new data based on what is fed to them as input. GANs can be used for a wide array of purposes in the field of image processing alone, such as image generation, image translation, text-to-image generation, super-resolution, pose estimation, text generation, video editing and animation. This versatility makes GANs a promising tool for a wide range of tasks in machine learning and computer vision.
GANs are created by combining two neural networks with opposing objectives, known as the generator and the discriminator. They are trained simultaneously in a game-theoretic manner, but only the generator is ultimately utilised; the discriminator exists in the network solely to train the generator. Random noise is provided to the generator as input, which produces synthetic data that resembles the training, or "real", data. Conversely, the discriminator network takes in real data as well as faux data produced by the generator and outputs a probability that the data is real. GANs have been used to produce images that are quite similar to real photographs, to perform style transfer and to generate new data samples that resemble a given dataset. They have shown promising results in a variety of applications and have been the subject of much research and development in recent years.
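To make this training dynamic concrete, the following is a minimal sketch of one GAN training step in PyTorch. It assumes a `generator` and a `discriminator` (the latter outputting a probability), their optimisers and a batch of real data are defined elsewhere; `latent_dim` is an illustrative hyperparameter, not a value from any cited paper.

```python
import torch
import torch.nn.functional as F

def gan_training_step(generator, discriminator, opt_g, opt_d,
                      real_batch, latent_dim=100):
    """One alternating update: first the discriminator, then the generator."""
    batch_size = real_batch.size(0)
    real_labels = torch.ones(batch_size, 1)
    fake_labels = torch.zeros(batch_size, 1)

    # Discriminator step: push D(real) toward 1 and D(fake) toward 0.
    noise = torch.randn(batch_size, latent_dim)
    fake_batch = generator(noise).detach()  # detach: do not update G here
    d_loss = (F.binary_cross_entropy(discriminator(real_batch), real_labels) +
              F.binary_cross_entropy(discriminator(fake_batch), fake_labels))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator step: push D(G(z)) toward 1, i.e. try to fool D.
    noise = torch.randn(batch_size, latent_dim)
    g_loss = F.binary_cross_entropy(discriminator(generator(noise)), real_labels)
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()
```

In practice, a full training run simply alternates these two updates over minibatches of the dataset.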
They can create lifelike images, augment datasets by generating synthetic variations and enhance image resolution. GANs offer creative potential, enabling artists and designers to explore novel visuals and artistic expressions. They also excel in image-to-image translation tasks, converting images from one domain to another, such as turning black-and-white images into color or transforming satellite images into maps. Additionally, GANs are used in face generation and style transfer, allowing the blending of styles from one image with the content of another. In medical applications, GANs can generate synthetic medical images for research and training purposes, which is particularly useful when real data is limited or sensitive.
This bibliometric investigation aids in assessing the current state of research in a specific field by analyzing diverse literature sources, including articles and conference papers, available in various databases. Employing various statistical tools allows for the monitoring of different trends and advancements. The analysis was conducted using tools such as Tableau and VOSviewer.
OBJECTIVES
To assess the volume of research conducted in the field of generative networks from the years 1984 to 2023.
To conduct citation analysis and determine the authors who have made significant contributions to the field.
To identify the most productive countries in terms of research output in the field of generative networks.
To identify the trends and areas of focus in publications based on affiliated institutions and funding sources.
LITERATURE REVIEW
“Generative Adversarial Networks” are a distinct category of neural networks first proposed by Goodfellow and co-authors in 2014.[1] The fundamental concept behind GANs is rooted in a two-player zero-sum game: any gain in utility experienced by one player is exactly offset by a loss of utility for the other. GANs are multi-layer networks whose layers can be either fully connected or convolutional. Although they do not have to be directly invertible, the generator and discriminator networks must be differentiable so that they can be trained with gradient-based methods. The discriminator network is a mathematical function that outputs the probability that an image belongs to the real data rather than being synthetically generated. When the discriminator is at its best, it may be frozen, i.e., it stops adjusting its weights and biases while the generator keeps learning to reduce the discriminator's accuracy. Neither the generator nor the discriminator alone drives training; progress comes from the interplay between the two.
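Formally, this zero-sum game is expressed by the minimax objective introduced by Goodfellow and co-authors,[1] where $D(x)$ is the discriminator's estimated probability that a sample $x$ is real and $G(z)$ is the generator's output for noise $z$ drawn from a prior $p_z$:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$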
The oldest paper found in the Scopus database, titled "3-D recognition of randomly oriented parts" by I. Walter and H. Tropf,[2] was published in 1984 in the "Proceedings of SPIE – The International Society for Optical Engineering". The paper's primary emphasis was on utilizing contours and a state-space search approach to create lifelike synthetic images. This inspired the research publications that followed decades later in the early 2000s. One such paper was "Template-based generation of road networks for virtual city modelling" by J. Sun and co-authors,[3] published in 2002, which focused on generating images for self-driving transportation by developing a virtual-reality rule-based system that took image-derived templates and returned a virtual traffic network.
Today, we are observing a surge not only in research on generative networks but also in their implementation, owing to the availability of resources that streamline the development and deployment process. Another reason is the increased popularity of generative networks outside research circles, with a variety of tools readily available.
Types
There are various types of GANs with respect to their architecture and how the neurons in the adjacent layers are connected to each other.
Fully Connected GAN
In the earliest GAN architectures, both the generator and the discriminator were neural networks in which every neuron in one layer is "fully" connected to every neuron in the next. Three relatively straightforward image datasets, MNIST (handwritten digits), CIFAR-10 (natural images) and the Toronto Face Dataset (TFD), were used to test this type of design.
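For concreteness, the sketch below pairs a fully connected generator with a fully connected discriminator in PyTorch, sized for flattened 28×28 MNIST images; the layer widths are illustrative assumptions, not the original paper's configuration.

```python
import torch.nn as nn

latent_dim, img_dim = 100, 28 * 28

# Generator: noise vector -> flattened image with pixel values in [-1, 1].
generator = nn.Sequential(
    nn.Linear(latent_dim, 256), nn.ReLU(),
    nn.Linear(256, 512), nn.ReLU(),
    nn.Linear(512, img_dim), nn.Tanh(),
)

# Discriminator: flattened image -> probability that the image is real.
discriminator = nn.Sequential(
    nn.Linear(img_dim, 512), nn.LeakyReLU(0.2),
    nn.Linear(512, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1), nn.Sigmoid(),
)
```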
Convolutional GAN
Compared to fully connected networks, convolutional neural networks are well suited to image data, and this class of GAN uses the CNN (Convolutional Neural Network) as its core component and inspiration. Training GANs with the same techniques as supervised learning can be challenging for various reasons, such as mode collapse, sensitivity to hyperparameters and vanishing gradients. The Laplacian Pyramid of Adversarial Networks (LAP-GAN)[4] addressed this by generating images in a coarse-to-fine fashion, and Radford and co-authors[5] proposed a subcategory of GAN architectures named DCGAN, which makes it possible to train a network comprising a deep convolutional generator and discriminator. Additionally, Wu and colleagues[6] introduced GANs that use volumetric convolutions to synthesize 3D data samples and developed a method to generate 3D variations of objects based on input in the form of 2D images.
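The sketch below shows a DCGAN-style convolutional generator for 64×64 RGB images, following the published design guidelines (strided transposed convolutions, batch normalisation and ReLU activations); the exact channel counts here are illustrative assumptions.

```python
import torch.nn as nn

dcgan_generator = nn.Sequential(
    # latent vector z, shaped (100, 1, 1) -> 4x4 feature map
    nn.ConvTranspose2d(100, 512, 4, stride=1, padding=0, bias=False),
    nn.BatchNorm2d(512), nn.ReLU(True),
    nn.ConvTranspose2d(512, 256, 4, stride=2, padding=1, bias=False),  # 8x8
    nn.BatchNorm2d(256), nn.ReLU(True),
    nn.ConvTranspose2d(256, 128, 4, stride=2, padding=1, bias=False),  # 16x16
    nn.BatchNorm2d(128), nn.ReLU(True),
    nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1, bias=False),   # 32x32
    nn.BatchNorm2d(64), nn.ReLU(True),
    nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1, bias=False),     # 64x64 RGB
    nn.Tanh(),
)
```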
Conditional GAN
Mirza and co-authors[7] extended the GAN architecture to a conditional variant that uses additional input variables to guide the generation process. These variables can include class labels, attributes or other structured data, and this approach offers the advantage of producing improved representations for generating multi-modal data. In addition to the random noise input that traditional GANs use, InfoGAN[8] adds a set of latent input variables that are learned by the network during training. These learned variables can represent high-level attributes of the generated output, such as the orientation of an object in an image.
InfoGAN achieves the capability of discovering and representing underlying factors of variation in the data by maximizing the mutual information between the learned variables and the generated output. This results in more controllable and interpretable generation of diverse outputs. InfoGAN finds application in various domains, such as image generation, data compression and data representation learning.
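Both cGAN and InfoGAN rest on feeding extra variables into the generator alongside the noise. A minimal sketch of that conditioning mechanism in PyTorch, in the cGAN style of class-label input, is shown below; the layer widths and embedding size are illustrative assumptions rather than values from the cited papers.

```python
import torch
import torch.nn as nn

class ConditionalGenerator(nn.Module):
    """Generator conditioned on a class label, concatenated with the noise."""
    def __init__(self, latent_dim=100, n_classes=10, img_dim=28 * 28):
        super().__init__()
        self.label_emb = nn.Embedding(n_classes, n_classes)
        self.net = nn.Sequential(
            nn.Linear(latent_dim + n_classes, 256), nn.ReLU(),
            nn.Linear(256, img_dim), nn.Tanh(),
        )

    def forward(self, noise, labels):
        # The label embedding steers which mode of the data G samples from.
        return self.net(torch.cat([noise, self.label_emb(labels)], dim=1))
```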
GAN with Inference Model
GANs face a limitation in mapping a given observation to a vector in the latent space, which has led to the proposal of various techniques for inverting the generator of pre-trained GANs.[9,10] Among these, ALI (Adversarially Learned Inference)[11] and Bidirectional GANs[12] have been introduced as relatively simple yet effective methods. These approaches incorporate an inference network consisting of an encoder and a decoder, along with discriminators that assess pairs of joint (data, latent) values. The discriminator’s role is to determine whether a given pair corresponds to a genuine tuple comprising a real image sample and its encoding or a fake image sample and its corresponding latent-space input used for generator synthesis.
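A minimal sketch of such a joint discriminator is given below, assuming flattened image vectors; the dimensions are illustrative assumptions, not the configurations used by ALI or BiGAN.

```python
import torch
import torch.nn as nn

class JointDiscriminator(nn.Module):
    """Scores (data, latent) pairs rather than data alone, as in ALI/BiGAN."""
    def __init__(self, img_dim=28 * 28, latent_dim=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(img_dim + latent_dim, 512), nn.LeakyReLU(0.2),
            nn.Linear(512, 1), nn.Sigmoid(),
        )

    def forward(self, x, z):
        # Real pair: (real image, encoder(real image))
        # Fake pair: (generator(z), z)
        return self.net(torch.cat([x, z], dim=1))
```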
However, it has been observed that the fidelity of reconstructed data samples generated using ALI/BiGAN can sometimes be subpar. To address this, researchers have explored the idea of enhancing sample fidelity by introducing an additional adversarial cost that considers the distribution of data samples and their corresponding reconstructions.[13]
Adversarial Autoencoder (AAE)
Adversarial Autoencoder (AAE) is a type of autoencoder that uses an adversarial loss to learn a compact and continuous representation of the input data. An autoencoder is a neural network that learns to reconstruct the input data by compressing it into a lower-dimensional latent space and then decoding it back into the original input shape. AAE adds an adversarial component to the standard autoencoder architecture, allowing the model to learn a more meaningful latent representation of the data. In AAE, the encoder maps the input data to a latent representation and the decoder maps the latent representation back to the original input data. The adversarial loss is used to encourage the latent representation to follow a specific distribution, such as a Gaussian distribution, which helps to smooth the latent space (latent-space GAN)[14] and make it more continuous. The discriminator is used to distinguish between real latent representations and fake ones generated by the encoder. The generator (encoder-decoder) is trained to produce latent representations that can fool the discriminator into thinking they are real. AAE has been used in various applications, such as image generation, data compression and data representation learning. It has shown promising results in generating high-quality and diverse images while maintaining a smooth and continuous latent space. This method is similar to a Variational Autoencoder (VAE),[15] but instead of using the KL-divergence term in the loss function, it uses the latent-space GAN.
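As a concrete illustration, the sketch below shows the adversarial regularisation step of an AAE in PyTorch, assuming an `encoder`, a latent-space discriminator `latent_disc` (outputting a probability) and their optimisers are defined elsewhere; the Gaussian prior and latent size are illustrative assumptions. The reconstruction loss of the autoencoder would be trained separately.

```python
import torch
import torch.nn.functional as F

def aae_regularisation_step(encoder, latent_disc, opt_enc, opt_disc,
                            x, latent_dim=8):
    batch_size = x.size(0)

    # Train the latent discriminator: prior samples -> real, codes -> fake.
    prior = torch.randn(batch_size, latent_dim)   # Gaussian prior samples
    codes = encoder(x).detach()                   # freeze encoder this step
    d_loss = (F.binary_cross_entropy(latent_disc(prior),
                                     torch.ones(batch_size, 1)) +
              F.binary_cross_entropy(latent_disc(codes),
                                     torch.zeros(batch_size, 1)))
    opt_disc.zero_grad()
    d_loss.backward()
    opt_disc.step()

    # Train the encoder (as generator) so its codes look like prior samples;
    # this replaces the KL-divergence term a VAE would use.
    g_loss = F.binary_cross_entropy(latent_disc(encoder(x)),
                                    torch.ones(batch_size, 1))
    opt_enc.zero_grad()
    g_loss.backward()
    opt_enc.step()
```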
Applications
Generative networks are well suited to creating sample data with a distribution similar to that of the real data, such as when creating photorealistic photos. The commonly faced drawback of inadequate training data for supervised learning can also be mitigated with GANs. Moreover, GANs have been used in voice and language processing, including the creation of dialogues.
Image Generation and Computer Vision
GANs can be used to produce image samples that closely resemble real images: examples include SRGAN[16] for image super-resolution, BEGAN[17] for good-quality face samples at 128×128 resolution (as shown in Figure 1) and the work of Santana et al. on driving scenarios. Experiments have revealed that GANs can generate road images that appear authentic, which can be utilized for autonomous driving with unsupervised or semi-supervised learning. Gou and colleagues suggest that both synthetic and real images be utilized, although each type of image has its own limitations. Shrivastava et al.[18] proposed SimGAN for learning from simulated and unsupervised images, referred to as S+U learning. The "Two-Pathway GAN" (TP-GAN) creates realistic frontal-view images from a single face image, but it requires a training set of paired examples of frontal-view images and face images in various poses; in contrast, CycleGAN is a versatile model for image-to-image translation that needs no paired examples and can be applied to various domains.
Language Processing
GANs have recently been used in speech generation and natural language processing. Li and colleagues[19] use GANs to capture natural speech and generate text based on what is being spoken. To generate human-language text, poems and even music, SeqGAN[20] uses reinforcement learning. Pascual and co-authors propose SEGAN for speech enhancement, a type of GAN designed to improve the quality of speech signals. Unlike traditional speech-enhancement methods that rely on handcrafted features and model assumptions, SEGAN learns representations of speech signals in an unsupervised manner, allowing it to adapt to a wide range of speech conditions and noise. Zhang and co-authors[21] proposed StackGAN for photorealistic image generation, a stacked GAN architecture that takes text embeddings as input, allowing it to generate images from textual descriptions. The model was trained on large-scale datasets such as COCO and has shown impressive results in generating images of birds and flowers from text descriptions.
Experiments revealed that it can produce nearly indistinguishable images simply from textual descriptions, although it struggles to generate fine, minute details. Examples of images produced by StackGAN from text-based descriptions are shown in Figure 2.
Miscellaneous Applications
GANs can be used in a range of applications, such as reinforcement learning, imitation learning and actor-critic methods. Hu and colleagues[22] have introduced MalGAN, a GAN-based approach that creates adversarial malware samples capable of bypassing black-box machine-learning-based detection models. Chidambaram et al. have proposed a framework for style transfer known as "Style Transfer Generative Adversarial Networks" (STGANs). In addition, Choi et al. have developed a medical GAN called medGAN, which generates realistic Electronic Health Records (EHRs). Its effectiveness has been demonstrated through experiments showing that synthetic EHR datasets perform comparably to real data in terms of medical expert review and predictive modelling.
As we have seen, Generative Adversarial Networks (GANs) support a wide variety of models and applications, ranging from fully connected and convolutional GANs to more specialized forms such as conditional GANs and adversarial autoencoders.
The evolution and diversification of these networks underscore their potential for generating high-quality synthetic data across different domains. The versatility of GANs has driven their adoption not only in image generation and computer vision but also in speech generation, natural language processing and cybersecurity. This innovation in research and practical applications highlights the critical role of GANs in advancing artificial intelligence. Given the extensive development and diverse applications of generative networks, it is crucial to systematically analyze the research landscape to understand the scope and impact of these advancements; the objectives guiding that analysis were stated in the "Objectives" section above.
MATERIALS AND METHODS
To fulfil the objectives set out in the "Objectives" section, the research methodology shown in Figure 3 was followed; it is explained below.
Initially, the problem is identified and discussed in the "Introduction" section; to provide more clarity, a thorough review of the existing literature on neural networks, focusing particularly on generative networks, is conducted in the "Literature Review" section, through which the objectives listed in the "Objectives" section were finalized. The academic databases Scopus and Web of Science were considered for the study (see the "Significant Keywords" section). The papers relevant to generative networks were filtered for further data analysis, detailed in the "Statistical Analysis" section, and network analysis is discussed in the "Network Analysis" section. The network analysis technique is used to visualize relationships, such as those between authors and publications, using VOSviewer. The "Discussion" section helps to understand research clusters and their collaborations. The interpretation of the analysis, followed by the conclusion and future recommendations, is given in the "Conclusion" section.
Significant Keywords
The keyword selection in both Scopus and WoS consists of all the primary keywords. These keywords are presented in Table 1, covering the period from 1984 to 2023.
Table 1: Primary keywords used for the search.

| Primary Keywords (combined with AND) |
|---|
| "generative" |
| "networks" |
| "image" |
| "generation" |
Initial Search Results
The preliminary search using the predefined keywords yielded 1481 documents from Scopus and 2899 documents from WoS. The majority of these documents, i.e., 1405 from Scopus and 2862 from WoS, are in the English language (as shown in Tables 2 and 3). All types of papers, regardless of their publication status, were included in the statistical analysis.
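The exact search string is not reproduced by the database export, but combining the primary keywords of Table 1 with the AND operator corresponds to a Scopus advanced search of roughly the following form (the field code is shown for illustration):

```
TITLE-ABS-KEY ( "generative" AND "networks" AND "image" AND "generation" )
```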
STATISTICAL ANALYSIS
Publication Trends by Year
Figures 4 and 5 illustrate the annual publication trends of generative networks for image generation in the Scopus database from 1984 to 2023 and in the WoS database from 2013 to 2022, respectively. The figures reveal a gradual rise in the number of publications since around 2016-17. Interestingly, there was a significant surge in research activities in this field in the year 2021. The steady growth observed in both the databases signifies the advancements happening in the domain of generative networks for image generation.
Type of Document
Figure 6 indicates that approximately half of the documents in the Scopus database are conference papers, followed by articles comprising 44.5% of the total. Similarly, Figure 7 shows that the WoS database also contains a nearly equal number of articles and proceedings papers. Additionally, WoS includes a substantial share of early access and review articles.
Sources
Figure 8 presents the statistics for sources of generative networks publications related to image generation from the Scopus database. The data reveals that the “Proceedings of SPIE – The International Society for Optical Engineering” has been involved in research in this field for the longest duration since 1984. However, “Lecture Notes in Computer Science”, including “Subseries Lecture Notes in Artificial Intelligence” and “Lecture Notes in Bioinformatics”, as well as “Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition”, experienced a sudden growth in publications since 2017, despite having a shorter time period of research activity. On the other hand, the source IEEE Access reached its peak in 2020 but appears to have been declining since then.
Prominent Subject Areas
Figures 9 and 10 display the subject areas of publications related to generative networks for image generation. Approximately 40% of the documents fall under the category of Computer Science, followed by Engineering at 17.8% and Mathematics at 10.9%. Other subject areas, such as Physics and Astronomy, Medicine and Earth Sciences, are less prevalent in comparison. A similar distribution is observed in the WoS categories, where the majority of publications belong to sub-divisions of Computer Science, such as Artificial Intelligence, Information Systems and Theory and Methods. Engineering, specifically Electrical and Electronic Engineering, also holds a substantial portion in this database. Another noteworthy category is Imaging Science and Photographic Technology, as generative networks are being applied to generate images in this field.
Funding Sponsors
Figure 11 depicts the top 10 funding sponsors, with the "National Natural Science Foundation of China" leading by a huge margin at almost 1000 documents, followed by the "National Key Research and Development Program of China". The majority of these sponsors originate from East Asian countries, namely China, South Korea and Japan.
Countries
As observed in Figures 12 and 13, China is leading the research in the field of image generation. The top funding sponsors are Chinese organisations, and the authors with the highest number of documents in this field are also of Chinese origin. Chinese is also the language with the second-highest number of documents, as observed in Tables 2 and 3.
Table 2: Documents by language in the Scopus database.

| Sl. No. | Language | No. of Documents |
|---|---|---|
| 1 | English | 1405 |
| 2 | Chinese | 64 |
| 3 | Turkish | 5 |
| 4 | Japanese | 3 |
| 5 | Russian | 2 |
| 6 | Korean | 2 |
| | Total | 1481 |
Table 3: Documents by language in the WoS database.

| Sl. No. | Language | No. of Documents |
|---|---|---|
| 1 | English | 2862 |
| 2 | Chinese | 26 |
| 3 | Turkish | 5 |
| 4 | Russian | 3 |
| 5 | French | 1 |
| 6 | Korean | 1 |
| 7 | Unspecified | 1 |
| | Total | 2899 |
Affiliated Institutions
As mentioned in the previous section, Chinese institutes continue to dominate the affiliations in the field of image generation using generative networks. In the Scopus database (see Figure 14), all the top 10 institutions are from China. In contrast, the Web of Science database includes one university based in the USA (see Figure 15), but the rest of the top affiliations are also from Chinese institutions. This further underscores the significant contribution of Chinese research organizations in advancing the field of image generation through generative networks.
Authors
As anticipated, the analysis reveals that all the top authors in both the Scopus and WoS are of Chinese origin. This correlation explains the dominance of Chinese funding sponsors and affiliated institutes, as these influential authors are associated with Chinese institutions. Their active contributions to research in the field of image generation using generative networks have positioned China at the forefront of this area of study (see Figures 16 and 17).
NETWORK ANALYSIS
Network analysis plays a crucial role in helping researchers identify the members of a network, understand their relationships and identify the most influential members. In the context of research on generative networks for image generation, network analysis can be used to examine various aspects:
Collaborative Countries
Examining collaborations between countries can shed light on international cooperation in the domain of generative networks for image generation. It helps understand the global distribution of research efforts and partnerships.
Top Cited Documents
Identifying the most highly cited publications allows researchers to recognize seminal works and influential contributions in the field. These key documents can shape the direction of research and provide a foundation for future studies.
Keyword Co-occurrence Analysis
Analysing the co-occurrence of keywords in publications can reveal important thematic connections and trends within the research. It helps understand the prevailing topics and focus areas within the field.
By conducting such network analyses on articles at the final published stage in the Scopus database, researchers can gain valuable insights into the contributions and collaborations of different countries and authors in the field of generative networks for image generation. These insights can aid in understanding the current state of research, identifying potential research partners and guiding future research endeavours.
Collaborative Countries Network Analysis
Based on the downloaded data from Scopus, documents originated from 68 countries. Out of these, 40 countries met the threshold of having at least 5 documents each and were selected for visualization using the VOSviewer tool. The size of each cluster in Figure 18 indicates the number of publications within that group. The analysis resulted in the formation of 11 distinct clusters. The strength of collaborative relationships between countries is represented by the thickness of the lines connecting them.
For instance, China exhibits a robust cooperative relationship with Hong Kong, followed by the USA. The thickest line is observed between India and the USA, with Australia following closely behind. Among all the countries, China stands out as the dominant force in research within this field, evident from the largest circle representing it in the visualization.
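The counting that underlies such a map can be sketched as follows, assuming a list of per-document country sets extracted from Scopus affiliation data; the variable `doc_countries` and its toy contents are hypothetical stand-ins. VOSviewer performs an analogous aggregation before layout and clustering.

```python
from itertools import combinations
from collections import Counter
import networkx as nx

# Toy stand-in: one set of contributing countries per document.
doc_countries = [{"China", "Hong Kong"}, {"China", "United States"},
                 {"India", "United States"}]

# Each pair of countries appearing on the same document adds one
# co-publication to the weight of the edge between them.
edge_weights = Counter()
for countries in doc_countries:
    for a, b in combinations(sorted(countries), 2):
        edge_weights[(a, b)] += 1

G = nx.Graph()
for (a, b), w in edge_weights.items():
    G.add_edge(a, b, weight=w)   # edge weight ~ line thickness in Figure 18
```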
It can be observed from Table 4 that documents originating from China have been cited the most, making it the most productive country in this field by a wide margin, followed by the USA and the UK. It is also worth noting that despite the huge gap between China and the USA in the number of documents, their citation counts are not that far apart. Japan has a higher number of documents than the UK, Hong Kong and Italy, yet a lower number of citations; this may be because publications from Japan are primarily in Japanese.
Table 4: Documents and citations by country in the Scopus database.

| Country | Documents | Citations |
|---|---|---|
| China | 731 | 3884 |
| United States | 184 | 2537 |
| United Kingdom | 56 | 1306 |
| Hong Kong | 35 | 1302 |
| Italy | 32 | 816 |
| Japan | 89 | 615 |
| Germany | 53 | 578 |
| South Korea | 92 | 480 |
| Canada | 39 | 457 |
| Portugal | 8 | 262 |
Keyword Co-occurrence Analysis
Keywords provide essential information about an article's methods and objectives. Keyword co-occurrence refers to the simultaneous appearance of two or more keywords in the same document. From the pool of 1481 publications, a total of 305 keywords met the minimum threshold of 10 occurrences set in VOSviewer. The visualization in Figure 19 displays the keywords, with circle size directly proportional to occurrence frequency. The lines connecting the circles indicate the strength of the relationships between keywords, forming 5 distinct clusters denoted by different colors.
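The counting step behind such a map can be sketched as below, assuming a Scopus CSV export read into pandas with a semicolon-separated keywords column; the file name and the "Index Keywords" column name are assumptions about the export format.

```python
from itertools import combinations
from collections import Counter
import pandas as pd

df = pd.read_csv("scopus_export.csv")   # hypothetical export file

occurrences, cooccurrences = Counter(), Counter()
for cell in df["Index Keywords"].dropna():
    # Normalise and de-duplicate the keywords of one document.
    kws = sorted({k.strip().lower() for k in cell.split(";") if k.strip()})
    occurrences.update(kws)
    cooccurrences.update(combinations(kws, 2))   # pairwise co-mentions

# Keep keywords occurring at least 10 times, mirroring the threshold above.
kept = {k for k, n in occurrences.items() if n >= 10}
```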
For example, the red cluster primarily focuses on generative adversarial networks, their architectures and applications. The green cluster encompasses keywords related to model training and performance evaluation. Analyzing these keywords can aid newcomers in finding relevant papers for their research pursuits.
Furthermore, we have provided a compilation of the top 20 keywords in Table 5. Notably, terms like convolutional neural networks and computer vision are among those listed. In the context of image generation using generative networks, the primary goal is to train the generator to create artificial images based on user specifications. The table also includes various image processing operations, such as generation, enhancement, synthesis and translation.
Table 5: Top 20 keywords by occurrence and total link strength.

| Rank | Keyword | Occurrences | Total link strength |
|---|---|---|---|
| 1 | Generative adversarial networks | 1074 | 10316 |
| 2 | Image generations | 848 | 8176 |
| 3 | Adversarial networks | 490 | 4500 |
| 4 | Deep learning | 457 | 4942 |
| 5 | Image enhancement | 410 | 4355 |
| 6 | Generative adversarial network | 367 | 3856 |
| 7 | Image generation | 314 | 2974 |
| 8 | Computer vision | 294 | 2789 |
| 9 | Generative model | 223 | 2019 |
| 10 | Semantics | 213 | 2231 |
| 11 | Image processing | 177 | 2246 |
| 12 | Images synthesis | 151 | 1690 |
| 13 | Learning systems | 149 | 1666 |
| 14 | Convolutional neural networks | 131 | 1555 |
| 15 | Textures | 124 | 1334 |
| 16 | Convolution | 120 | 1379 |
| 17 | Network architecture | 110 | 1186 |
| 18 | Image translation | 109 | 1044 |
| 19 | Article | 104 | 1748 |
| 20 | Medical imaging | 98 | 1283 |
Most Cited Articles
Citation analysis is a valuable method for gauging the influence of a research paper, as citation counts reveal an article's significance. Table 6 displays the ten most frequently cited papers in the Scopus database. Among these, "DRAW: A recurrent neural network for image generation" stands out as the most cited article; as the oldest paper among the top ten, it has also had the most time to accumulate citations.
Table 6: The ten most cited documents in the Scopus database.

| Rank | Document Title | Year | Cited By |
|---|---|---|---|
| 1 | "DRAW: A recurrent neural network for image generation" | 2015 | 711 |
| 2 | "StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks" | 2019 | 496 |
| 3 | "End-to-End Adversarial Retinal Image Synthesis" | 2018 | 253 |
| 4 | "GAN-based synthetic brain MR image generation" | 2018 | 184 |
| 5 | "Closed-Form Factorization of Latent Semantics in GANs" | 2021 | 149 |
| 6 | "Learning to Generate Chairs, Tables and Cars with Convolutional Networks" | 2017 | 145 |
| 7 | "CFGAN: A generic collaborative filtering framework based on generative adversarial networks" | 2018 | 135 |
| 8 | "StyleFlow: Attribute-conditioned Exploration of StyleGAN-Generated Images using Conditional Continuous Normalizing Flows" | 2021 | 132 |
| 9 | "Inverting the Generator of a Generative Adversarial Network" | 2019 | 127 |
| 10 | "Generative Adversarial Networks in Computer Vision: A Survey and Taxonomy" | 2021 | 125 |
DISCUSSION
Analysis of generative networks for image generation has uncovered important patterns and discoveries in this rapidly advancing area. Information from Scopus and WoS suggests a significant uptick in research activity starting from approximately 2016-17, with a particularly noticeable increase in publications by 2021. This increase signifies the growing interest in generative networks and their widening range of uses, underscoring the dynamic advancement and potential for further progress in the field.
China has become a major contributor, as evidenced by its large number of publications, influential researchers, and significant funding from Chinese institutions. This leadership is further strengthened by extensive partnerships, especially with the United States, and the strong network of Chinese organizations involved in this research. The significant contributions of Chinese researchers and institutions to the field, in terms of both publication volume and citations, highlight their crucial role in advancing generative network technologies.
Analysis of keyword co-occurrence has pinpointed key themes and areas of focus, including training models, evaluating performance, and utilizing generative networks for tasks such as image enhancement and synthesis. This approach provides valuable insights into current trends and potential future directions for research. A comprehensive examination of the research landscape, encompassing trends in publications, types of documents, and collaborative networks, emphasizes the importance of continuous innovation and collaboration to advance the potential and applications of generative networks across various fields.
Figure 20 features a word cloud that depicts the prevalent and crucial keywords within a dataset concerning generative adversarial networks, image processing, and deep learning. Notably, key phrases such as “adversarial network,” “generative adversarial,” and “image generation” are prominently showcased, underscoring their significance. Other noteworthy terms encompass “deep learning,” “neural network,” and “computer vision,” indicating a broad interest in machine learning and image analysis. This visualization offers a concise overview of the primary subjects, aiding researchers, educators, and students in recognizing major patterns, areas of interest, and potential project concepts within the discipline.
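For readers who wish to reproduce such a figure, a minimal sketch using the open-source `wordcloud` package is given below, assuming a dictionary `occurrences` mapping keywords to frequencies (for instance, the counter built in the keyword co-occurrence sketch above).

```python
import matplotlib.pyplot as plt
from wordcloud import WordCloud

# Scale word sizes by keyword frequency; dimensions are illustrative.
wc = WordCloud(width=800, height=400, background_color="white")
wc.generate_from_frequencies(occurrences)

plt.imshow(wc, interpolation="bilinear")
plt.axis("off")
plt.show()
```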
CONCLUSION
Through this bibliometric analysis, conducted using various tools, it is observed that many authors have contributed to the field of generative networks for image generation in recent years. China is leading the research in this field, with the top 10 authors and funding sponsors being of Chinese origin. With new technologies and network architectures for generation still emerging, the domain remains largely unexplored and can be considered to be in its early stages; GANs and their potential applications continue to be an active area of investigation. The analysis was performed on data gathered from Scopus and WoS on image generation using generative networks from 1984 to 2023. From this bibliometric analysis, the following conclusions have been derived:
The number of publications experienced a significant increase in 2017.
A wide variety of publications are available but conference papers and articles are the most common for Scopus and WoS respectively.
"Lecture Notes in Computer Science", including its Artificial Intelligence subseries, is the most prolific source.
Computer Science stands out as the major subject area.
China has the highest number of funding sponsors, affiliated institutions and authors.
China is leading the research by a wide margin.
“Generative Adversarial Networks” is the most common keyword.
The implications from the study are:
Research and Development: Ongoing innovation in generative network models is crucial. Researchers should focus on enhancing training stability and developing more efficient architectures to improve the quality of generated images.
Collaboration and Knowledge Sharing: Extensive collaboration, especially between China and the USA, is vital. Encouraging international partnerships can accelerate advancements and result in more robust and versatile generative models.
Practical Uses: Generative networks have a broad range of practical applications, including generating images and augmenting data for training AI models. Sectors such as entertainment, healthcare, and autonomous driving can profit from synthetic data of high quality.
Regulations and Financial Support: Those responsible for making policies and providing funding should back research on generative networks by allocating resources to AI infrastructure and creating environments that encourage innovation and collaboration, in order to stay competitive in this advancing field.
Implications for National Innovation Systems (NIS): ASEAN countries can address local societal needs, such as education and healthcare, through social innovation based on generative models. East Asian models may continue to emphasize industrial applications and commercialization. The region's innovation capacity and competitiveness can be enhanced by fostering collaboration with leading countries working on generative models, supporting research capabilities in line with NIS policies.
The importance of continuous innovation in generative network models is highlighted by these results. Overcoming current limitations, such as enhancing training stability and creating more efficient architectures to improve the quality of generated images, should be the primary focus for researchers. It will be essential to invest in AI infrastructure and cultivate environments that encourage innovation and collaboration to stay competitive in this swiftly developing field.
However, it is essential to acknowledge the limitations of this study as recommendations for future researchers, who can focus on improving the models and evaluating them; investigating hybrid approaches could also be explored in future work.
This is a comparatively new field, with a sudden increase in publications only in recent years. Many types of GANs are being proposed for various purposes, but they require substantial computational resources and memory; further research is required to address this issue.
Cite this article:
Anuja. Generative Networks for Image Generation in East Asia’s Innovation Systems: A Bibliometric Analysis. J Scientometric Res. 2024;13(3s):s22-s38.
ACKNOWLEDGMENT
We express our sincere gratitude to all contributors including affiliated institutions and individuals who supported this study. Special thanks to the Scopus and Web of Science databases for providing access to the bibliometric data. We also acknowledge the use of tools such as Tableau and VOSviewer for analysis.
ABBREVIATIONS
| Abbreviation | Full Form |
|---|---|
| GANs | Generative Adversarial Networks |
| MSCCS | Master of Science in Computer Science |
| WoS | Web of Science |
| AAE | Adversarial Autoencoder |
| VAE | Variational Autoencoder |
| CNN | Convolutional Neural Network |
| SRGAN | Super-Resolution Generative Adversarial Network |
| TP-GAN | Two-Pathway Generative Adversarial Network |
| LAP-GAN | Laplacian Pyramid Generative Adversarial Network |
| DCGAN | Deep Convolutional Generative Adversarial Network |
| InfoGAN | Information Maximizing Generative Adversarial Network |
| ALI | Adversarially Learned Inference |
| BiGAN | Bidirectional Generative Adversarial Network |
| BEGAN | Boundary Equilibrium Generative Adversarial Network |
| SEGAN | Speech Enhancement Generative Adversarial Network |
| STGAN | Style Transfer Generative Adversarial Network |
| medGAN | Medical Generative Adversarial Network |
| EHR | Electronic Health Records |
| COCO | Common Objects in Context Dataset |
References
- Goodfellow IJ, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, et al. Generative adversarial networks. Commun ACM. 2020;63(11):139-44. [CrossRef] | [Google Scholar]
- Walter I, Tropf H. 3-D recognition of randomly oriented parts. Proc SPIE. 1984:0449 [CrossRef] | [Google Scholar]
- Sun J, Yu X, Baciu G, Green M. Template-based generation of road networks for virtual city modeling. VRST ‘02. 2002:33-40. [CrossRef] | [Google Scholar]
- Denton E, Chintala S, Szlam A, Fergus R. Deep generative image models using a laplacian pyramid of adversarial networks. Adv Neural Inf Process Syst. 2015;28(NIPS 2015) [CrossRef] | [Google Scholar]
- Zhu JY, Park T, Isola P, Efros AA. Unpaired image-to-image translation using cycle-consistent adversarial networks. 2017:2242-51. [CrossRef] | [Google Scholar]
- Wu J, Zhang C, Xue T, Freeman WT, Tenenbaum JB. Learning a probabilistic latent space of object shapes via 3D generative-adversarial modeling. 2016 [CrossRef] | [Google Scholar]
- Mirza M, Osindero S. Conditional Generative Adversarial Nets. arXiv preprint arXiv:1411.1784. 2014 [CrossRef] | [Google Scholar]
- Chen X, Duan Y, Houthooft R, Schulman J, Sutskever I, Abbeel P, et al. InfoGAN: interpretable representation learning by information maximizing generative adversarial nets. Adv Neural Inf Process Syst. 2016:29 [CrossRef] | [Google Scholar]
- Creswell A, Bharath AA. Inverting the generator of a generative adversarial network. IEEE Trans Neural Netw Learn Syst. 2018;30(7):1967-74. [PubMed] | [CrossRef] | [Google Scholar]
- Lipton ZC, Tripathi S. Precise Recovery of Latent Vectors from Generative Adversarial Networks. arXiv preprint arXiv:1702.04782. 2017 [PubMed] | [CrossRef] | [Google Scholar]
- Dumoulin V, Belghazi I, Poole B, Mastropietro O, Lamb A, Arjovsky M, et al. Adversarially Learned Inference. arXiv preprint arXiv:1606.00704. 2016 [PubMed] | [CrossRef] | [Google Scholar]
- Donahue J, Krähenbühl P, Darrell T. Adversarial Feature Learning. arXiv preprint arXiv:1605.09782. 2016 [PubMed] | [CrossRef] | [Google Scholar]
- Li C, Liu H, Chen C, Pu Y, Chen L, Henao R, et al. ALICE: towards understanding adversarial learning for joint distribution matching. Adv Neural Inf Process Syst. 2017:30 [PubMed] | [CrossRef] | [Google Scholar]
- Makhzani A, Shlens J, Jaitly N, Goodfellow I, Frey B. Adversarial Autoencoders. arXiv preprint arXiv:1511.05644. 2015 [PubMed] | [CrossRef] | [Google Scholar]
- Kingma DP, Welling M. Auto-Encoding Variational Bayes. arXiv preprint arXiv:1312.6114. 2013 [PubMed] | [CrossRef] | [Google Scholar]
- Ledig C, Theis L, Huszar F, Caballero J, Cunningham A, Acosta A, et al. Photo-realistic single image super-resolution using a generative adversarial network. 2017:105-14. [CrossRef] | [Google Scholar]
- Berthelot D, Schumm T, Metz L. BEGAN: Boundary Equilibrium Generative Adversarial Networks. arXiv preprint arXiv:1703.10717. 2017 [CrossRef] | [Google Scholar]
- Shrivastava A, Pfister T, Tuzel O, Susskind J, Wang W, Webb R, et al. Learning from simulated and unsupervised images through adversarial training. 2017:2242-51. [CrossRef] | [Google Scholar]
- Li J, Monroe W, Shi T, Jean S, Ritter A, Jurafsky D, et al. Adversarial Learning for Neural Dialogue Generation. arXiv preprint arXiv:1701.06547. 2017 [CrossRef] | [Google Scholar]
- Yu L, Zhang W, Wang J, Yu Y. SeqGAN: sequence generative adversarial nets with policy gradient. AAAI. 2017;31(1) [CrossRef] | [Google Scholar]
- Zhang H, Xu T, Li H, Zhang S, Wang X, Huang X, et al. StackGAN: text to photo-realistic image synthesis with stacked generative adversarial networks. 2017:5908-16. [CrossRef] | [Google Scholar]
- Hu W, Tan Y. Generating adversarial malware examples for black-box attacks based on GAN. 2022:409-23. [CrossRef] | [Google Scholar]