Comparative Analysis of City Image Promotional Videos of Two Places from the Perspective of Visual Narrative Theory

Abstract: Based on the visual narrative theory of Painter, this study examines the Anhui global promotional video "Splendid Anhui Welcoming the World" and the Chengdu FISU Games promotional video "Countdown to Chengdu 3000 Years". A multimodal comparative analysis of the two videos is conducted from the perspectives of interpersonal meaning, conceptual meaning, and textual meaning in order to explore the images that promotional videos of different cities present. The analysis suggests that Anhui promotional videos should increase emotional involvement, incorporate character stories, and make greater use of the intermediary perspective so that viewers experience the charm of the region at the micro level, because this visual narrative method is more likely to spark an emotional response in the audience. Promotional videos should not only promote the hardware of a city or region but, most importantly, convey its cultural connotations, spiritual character, and life experience.


Introduction
In the 1950s, Halliday [1] proposed that language has three major metafunctions, namely the ideational (conceptual) function, the interpersonal function, and the textual function. Based on these three metafunctions, Kress and van Leeuwen [2] extended the theory from language to the visual level and proposed a visual grammar centered on representational meaning, interactive meaning, and compositional meaning. Consequently, discourse analysis expanded from the single-modal analysis of language to the analysis of both language and image. In 2001, Kress and Jewitt first proposed the concept of multimodal discourse analysis, initiating a trend in multimodal discourse research. Multimodal discourse refers to communication that draws on several senses, such as hearing, sight, and touch, through various means and symbolic resources such as language, images, sound, and action. Painter, Martin, and Unsworth [3] extended the theory of visual grammar and proposed a visual narrative theory centered on interpersonal meaning, conceptual meaning, and textual meaning, which is better suited to analyzing dynamic discourse such as promotional videos and films.
With the development of society and information technology, the external publicity and cultural exchange of many countries are increasingly presented to the public in a multimodal manner. Multimodal discourse draws on people's multiple senses and transmits information through symbolic resources such as images, sounds, and actions. The advantages of multimodal analysis are therefore increasingly valued, and multimodal discourse analysis is being applied ever more widely. One such application is the analysis of promotional videos. Liang Bing and Jiang Ping [4] studied various promotional videos using visual grammar as the theoretical framework. Pan Yanyan and Zhang Hui [4] interpreted promotional videos on the basis of cognitive linguistics. Relatively little research uses the visual narrative theory of Painter, and even less compares promotional videos of different cities from the perspective of visual narrative.

The Theoretical Framework of Visual Narrative
On the basis of visual grammar, three renowned systemic functional linguists, Painter, Martin, and Unsworth, proposed the theoretical framework of visual narrative (Figure 1) in Reading Visual Narratives: Image Analysis of Children's Picture Books. The framework has three parts: interpersonal meaning, conceptual meaning, and textual meaning. Interpersonal meaning mainly includes the focalization system, the pathos system, and the ambience system; conceptual meaning mainly includes character representation, relationships between events, and relationships between backgrounds; textual meaning mainly includes layout, framing, and focus. The book revises and supplements visual grammar in two main respects. First, the research object extends from a single image to complex multimodal discourse containing multiple images, sounds, actions, language, and text. Second, it adds the interactive relationship between images and readers, giving more consideration to the reader's feelings and aesthetic experience, paying more attention to the interaction among the text, the author, and the reader, and interpreting the emotional appeal of the text at a deeper level.

Research Objects and Research Methods
In April 2017, the Anhui Global Promotion Event of the Ministry of Foreign Affairs was held in the "Blue Hall" of the South Building of the Ministry of Foreign Affairs, accompanied by the release of the Anhui global promotional film "Splendid Anhui Welcoming the World", which made a strong impression worldwide. "Today, we are all salespeople from Anhui," said Foreign Minister Wang Yi in praise of Anhui. Anhui is culturally rich and innovative. The promotional video showcases Anhui's beautiful natural scenery, its precious world cultural heritage, and the tremendous changes and technological innovation in its economic development. The film opens with the scenery of Mount Huangshan, a world-famous mountain. The sea of clouds, the strange peaks, the grotesque rocks, and other natural landscapes are so beautiful that the audience can truly share Xu Xiake's feeling that "having returned from the Five Great Mountains, one has no wish to see other mountains; having returned from Mount Huangshan, one has no wish to see the Five Great Mountains". Hongcun Village, a picturesque Chinese village, and Xidi Ancient Village, known as the Peach Blossom Garden, are depicted in sequence. The classic Huizhou-style ancient houses are set against beautiful green mountains, and the villages are well laid out. The fields around the villages are covered with golden rapeseed flowers, presenting a charming rural scene. Anhui, the starting point of China's rural reform, is heading towards the world and resolutely breaking new paths. Anhui has now achieved remarkable results in leading science and technology and in advancing reform and opening up. The scenes of Xiaogang Village, the slowly moving China-Europe freight train, the speeding high-speed railway, iFlytek's robot Xiao Ai, and Anhui's high-tech laboratories allow viewers to feel vividly the potential and strong momentum of Anhui's development.
A netizen commented on "Countdown to Chengdu 3000 Years", the promotional video for the 2021 Chengdu FISU Games: "The promotional film of Chengdu is really the ceiling of the industry." From "Sanxingdui" to the "Shu land of the Three Kingdoms", from the "Dujiangyan Irrigation Project" to "Du Fu Thatched Cottage", from "the hometown of pandas" to "the city of hotpot", Chengdu carries a vast historical heritage and unique cultural connotations, which are an inexhaustible treasure for the city. The popularity of "Countdown to Chengdu 3000 Years" on the internet is inseparable from the profound "Chengdu flavor" of its historical coordinates. It not only selects familiar stories, such as Li Bing and his son building the Dujiangyan Irrigation Project to control the water, but also includes lesser-known facts, such as Zhuge Liang appointing a "brocade official" to make Shu brocade famous overseas. It creatively tells the unique history of the city and people's feelings for this land through changing times, which enables the audience to empathize and resonate with Chengdu and leaves a strong impression.

Interpersonal Meaning

Focalization System
The focalization system concerns the depiction of characters in the image. It addresses two main questions: first, whether there is interaction between the characters in the image and the reader; and second, what reading perspective the image offers the reader, which determines how the reader relates to the image.

Interaction: Contact and Observation
In terms of interaction, there are two types of images: contact and observation. The question is whether there is interaction between the audience and the characters in the image. If there is, it is a contact image; if there is not, it is an observation image. The distinction is reflected in whether the people or objects in the image make eye contact with the audience. Comparative analysis shows that contact shots are more common in "Countdown to Chengdu 3000 Years". Painter believes that contact images can create an effect similar to actual eye contact between the characters in the image and the audience, allowing the audience to become more emotionally immersed and to enter the specific storyline. "Splendid Anhui Welcoming the World", by contrast, is presented mainly through observation shots, which let the audience observe and understand from outside the story and judge objectively the city image constructed in Anhui's global promotional film.

Perspective: Intermediary and Non-Intermediary
Perspective refers to the viewpoint that the screen offers the audience, and it is divided into the intermediary perspective and the non-intermediary perspective. In an intermediary perspective, the audience observes from the viewpoint of a character in the picture, while in a non-intermediary perspective the audience's viewpoint does not coincide with that of any character. The intermediary perspective can be further divided into the direct intermediary perspective and the triggered intermediary perspective. For ease of understanding, the film concepts of subjective and objective shots are introduced here, with the intermediary perspective corresponding to subjective shots and the non-intermediary perspective to objective shots. A direct subjective shot is one in which the audience observes from the viewpoint of a character in the picture, while a triggered subjective shot is one in which the audience infers a character's viewpoint from a sequence of consecutive pictures.
The analysis finds that both promotional videos use the intermediary perspective, but it occupies far more screen time in "Countdown to Chengdu 3000 Years" than in "Splendid Anhui Welcoming the World", and the former also devotes more time to direct subjective shots than the latter. In tracing more than 3000 years of Chengdu's history, each segment corresponds to an important point in time, at which one or more characters on camera lead the audience to witness the development of Chengdu. In "Splendid Anhui Welcoming the World", direct subjective shots take up relatively little time. The film is interspersed with foreigners practicing calligraphy and Huangmei Opera, and with Philip, a Frenchman who chose to settle in Hefei. Through these direct subjective shots, viewers can understand that Anhui is an attractive place. Medium and close-up shots taken at eye level establish a slightly closer social distance, placing the audience on an equal footing with the characters in the image and letting them feel the characters' emotions more intuitively.
In the focalization system, city image promotional videos contain a certain proportion of shots in which characters make eye contact with the audience, establishing direct contact between the two. The characters' emotions, whether joy, anger, or sorrow, can be conveyed directly to the audience through their eyes, giving the audience a clearer emotional understanding. In this way both parties reach a higher degree of resonance, making the city image more vivid and three-dimensional. The intermediary perspective lets the audience view close-up shots through the eyes of the characters in the picture; the emotions and details of the characters and objects are portrayed very clearly, so the audience can directly observe the various elements in the picture.

Pathos System
In the pathos (emotional) system, images engage the reader emotionally, mainly through intervention or alienation. Intervention includes three categories: appreciation, empathy, and individuality. Applied to promotional videos, empathy scenes are those in which the audience can immerse themselves in a certain group, feel its emotions, and become aware of its shared moral qualities. Individual scenes, on the other hand, depict specific characters with their own personalities and needs rather than a type of person, and let the audience perceive their emotions clearly. Comparing the two promotional videos, the proportion of empathy and individual scenes in "Countdown to Chengdu 3000 Years" is markedly higher than in "Splendid Anhui Welcoming the World", indicating that its producers valued the audience's emotional investment and used intervention scenes to enhance emotional involvement. "Splendid Anhui Welcoming the World" mostly uses appreciation images, adopting the simplest style to showcase the beautiful scenery of Anhui. Emotional intervention is relatively limited, and so is emotional interaction with the audience. The audience does not easily feel close to the images or empathize with the scenes. Therefore, when city image promotional videos showcase multiple aspects of a region, such as scenery, culture, history, cuisine, and economy, they can draw on the pathos system: intervention methods raise the degree of visual involvement, and greater use of empathy and individual images narrows the emotional distance from the audience.

Conceptual Meaning
The conceptual meaning framework focuses on the construction of visual processes. Building on the framework of Kress and van Leeuwen, Painter proposes three extensions: first, the representation of characters and their relationships; second, beyond the clause unit, a framework for the relationships between events in visual narrative; and third, the changes of background between different images. Character representation can be divided into complete representation and metaphorical representation. Complete representation includes the facial features of the character and clearly constructs the character's identity; metaphorical representation conveys the identity of a character through partial features, such as other parts of the body or an outline, forming a metaphorical relationship. Event relationships concern the connections between different events in visual narrative and take two forms: unfolding and projection. In unfolding, two events occur either in chronological order or simultaneously. In projection, the event in one image is seen or imagined by a character in another image, so projection can be further divided into real event projection and imagined event projection. Background relationships concern the continuity and change between the events of adjacent images in visual narrative. The choice of background or perspective depends on whether the situation has changed: when it has not, the same background can be kept while the perspective changes; when it has, the background should change along with the characters or events in the picture.

Character Representation
"Countdown to Chengdu 3000 Years" has no fixed protagonist at the beginning of the film. From the ancient Shu people running, to Li Bing and his son building the Dujiangyan Irrigation Project to control the water, from Du Fu in the Tang Dynasty writing in his Thatched Cottage to two women in the Song Dynasty drinking tea with fans in hand, and on to the public movement after the founding of New China, the characters are scattered across different scenes and differ from scene to scene, so they are constructed through complete representation. Through shifts between subjective and objective perspectives and the use of interactive images, viewers can form cognitive and emotional connections with various groups in the city. In the subsequent narrative, metaphorical character representation is used judiciously. The photographer adopts a close-range perspective, focusing on the faces of young people on skateboards and athletes swinging their clubs. The whole character is represented through parts of the body, forming a metaphorical perspective. This combination of complete and metaphorical representation fully showcases the positive, enterprising, and energetic state of the people of Chengdu, while emphasizing the relaxing experience that sports bring, further demonstrating that Chengdu is a vibrant city. Metaphorical representation has many benefits in filming: it increases emotional interaction with the audience, brings them closer, gives them a sense of being present, and stirs a longing to go to Chengdu, relax, and experience life there.
In "Splendid Anhui Welcoming the World", character identity is established mainly through complete representation. The photographer used complete representation to depict foreign tourists climbing Mount Huangshan, Philip, a Frenchman who settled in Hefei, a foreign girl learning to sing Huangmei Opera from Han Zaifen, a Huangmei Opera performing artist, and a foreign man practicing Wuqinxi, giving each character a clear identity. In city image promotional videos, people of different groups, professions, and ages jointly construct the character images. However, because of the limited running time, most characters appear only once or twice. Complete character representation allows the audience to perceive and recognize the characters promptly.

Event Relationships
From the perspective of event relationships, the events in Anhui's global promotional film stand in unfolding relationships. The film describes Anhui's natural landscape, long-standing culture, technology, and transportation objectively, from a third-person, non-intermediary perspective; its events unfold sequentially, and it is difficult to determine whether any of them are simultaneous. The Chengdu promotional video shows a typical unfolding relationship in its narrative. Moving in chronological order from ancient times to the present, the countdown showcases Chengdu's anticipation of the smooth opening of the 2021 FISU Games.

Background Relationships
In terms of background relationships, both promotional videos choose the background and adjust the shooting perspective according to changes in the scene, highlighting content and meaning by changing shots and switching backgrounds.

Textual Meaning
Regarding textual meaning, Painter et al. believe that in visual narrative discourse the layout framework can better explain the relationship between images and text. The layout relationship between image and text is mainly of two kinds: fusion and complementarity.

Fusion
Fusion refers to the integration of images and text, where language is part of the image. Fusion is divided into projection and expansion. In an expansion relationship, image and text each express their own meaning while connecting with and complementing each other. In the image-text layout shown in Figure 2, the picture is set against a light blue sky, with mountains shrouded in clouds and mist. The words "Open China" and "Splendid Anhui Welcoming the World" are added in the middle as a supplement. It can therefore be inferred that Anhui is a beautiful mountainous area and that the layout is one of fusion. The background image and the text in the center of the picture each have their own meaning while complementing and corresponding to each other; thus, a unified context is constructed for the discourse.

Complementary
The complementary type concerns the proportions of image and text in the whole picture, mainly involving importance, positional relationships, and symmetry. Different layouts and different emphases on image and text produce different visual effects and construct different meanings. In Figure 3, the word "Chengdu" is noticeably larger, and its white font makes it more prominent. "3000" is set in a brighter yellow. Emphasizing these keywords makes the audience clearly perceive that "Chengdu" is the theme of the film, and the number is also eye-catching. The text therefore carries more weight than the image, which serves as the background.

Conclusion
The visual narrative framework offers guidance for the image analysis of video discourse and can also inform the shooting and production of promotional videos. A comparative analysis of the interpersonal, conceptual, and textual meanings of the two regions' promotional videos shows both similarities and many differences. From the perspective of interpersonal meaning, the two promotional videos use individual images to narrow the emotional distance from the audience, giving the pathos system a high degree of involvement; at the same time, they make skillful use of the ambience system to regulate color, tone, and naturalness, creating images that suit the style and connotation of promotional videos and help construct the local image. In terms of the focalization system, Anhui promotional videos still have considerable room for development compared with Chengdu's. The Chengdu promotional video features a countdown design and uses interactive and intermediary perspectives to evoke emotional resonance in the audience; Anhui promotional videos, however, mainly present their visuals from a non-intermediary perspective and lack a main storyline, resulting in low emotional engagement and leaving their promotional effectiveness in need of improvement. From the perspective of conceptual meaning, Chengdu promotional videos make skillful use of transitions between close and distant shots, combining complete and metaphorical representation to highlight key points and convey information and emotions accurately. In the narrative process, unfolding and projection relationships between events are combined, and the background is switched as the situation changes to construct a logical sequence of events. By comparison, Anhui promotional videos lack the participation of a protagonist and present local customs and traditions from an objective perspective. They lack coherence in the development of events, and the connections between the background images are not close enough, making them fragmented. The content of the promotional videos may seem complete, but it lacks storytelling and a human touch. From the perspective of textual meaning, both promotional videos pay attention to the layout of image and text, highlighting their themes through careful matching of the two. It is recommended that Anhui promotional videos increase emotional involvement, incorporate character stories, and make greater use of the intermediary perspective so that viewers experience the charm of the region at the micro level, because this visual narrative method is more likely to spark an emotional response in the audience. Promotional videos should not only promote the hardware of a city or region but, most importantly, convey its cultural connotations, spiritual character, and life experience. This is the core of what makes a city unforgettable.
The images analyzed in this study are limited in number and their interpretation is subjective, and the analysis of conceptual and textual meanings lacks depth. I hope that more scholars will turn their attention to visual narrative theory in the future, deepen their understanding of it, and use it to guide the shooting and production of regional image promotional videos, thereby promoting the construction of regional images.

Figure 1. Overall Framework of Visual Narrative